List:Commits« Previous MessageNext Message »
From:plavin Date:December 3 2007 6:47pm
Subject:svn commit - mysqldoc@docsrva: r9067 - in trunk: dynamic-docs/command-optvars refman-4.1 userguide
View as plain text  
Author: plavin
Date: 2007-12-03 18:47:22 +0100 (Mon, 03 Dec 2007)
New Revision: 9067

Log:
Rename se-innodb to se-innodb-core because of dynamic content


Removed:
   trunk/refman-4.1/se-innodb.xml
Renamed/Moved:
   trunk/refman-4.1/se-innodb-core.xml (from rev 9063, trunk/refman-4.1/se-innodb.xml)
Modified:
   trunk/dynamic-docs/command-optvars/mysqld.xml
   trunk/refman-4.1/storage-engines.xml
   trunk/userguide/about.xml


Modified: trunk/dynamic-docs/command-optvars/mysqld.xml
===================================================================
--- trunk/refman-4.1/se-innodb-core.xml	                        (rev 0)
+++ trunk/refman-4.1/se-innodb-core.xml	2007-12-03 17:47:22 UTC (rev 9067)
Changed blocks: 4, Lines Added: 7639, Lines Deleted: 4; 284985 bytes

@@ -7832,7 +7832,6 @@
       <manual version="4.1"/>
       <manual version="5.0"/>
       <manual version="5.1"/>
-      <removed version="5.1.18"/>
       <manual version="6.0"/>
     </versions>
 

@@ -8067,7 +8066,6 @@
       <manual version="4.1"/>
       <manual version="5.0"/>
       <manual version="5.1"/>
-      <removed version="5.1.18"/>
       <manual version="6.0"/>
     </versions>
 

@@ -8135,7 +8133,6 @@
       <manual version="5.0"/>
       <introduced version="5.0.3"/>
       <manual version="5.1"/>
-      <removed version="5.1.18"/>
       <manual version="6.0"/>
     </versions>
 

Copied: trunk/refman-4.1/se-innodb-core.xml (from rev 9063,
trunk/refman-4.1/se-innodb.xml)
===================================================================
--- trunk/refman-4.1/se-innodb-core.xml	                        (rev 0)
+++ trunk/refman-4.1/se-innodb-core.xml	2007-12-03 17:47:22 UTC (rev 9067)

@@ -0,0 +1,7638 @@
+<?xml version="1.0" encoding="utf-8"?>
+<!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
"http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd" [
+<!ENTITY % all.entities SYSTEM "all-entities.ent">
+  %all.entities;
+]>
+<section id="innodb">
+
+  <title>The <literal>InnoDB</literal> Storage Engine</title>
+
+  <indexterm>
+    <primary>InnoDB storage engine</primary>
+  </indexterm>
+
+  <indexterm>
+    <primary>tables</primary>
+    <secondary>InnoDB</secondary>
+  </indexterm>
+
+  <remark role="todo">
+    A number of node names in this chapter are very generic (e.g.,
+    "Backing up") and could apply in non-InnoDB contexts. Might be best
+    to rename a bunch of them.
+  </remark>
+
+  <section id="innodb-overview">
+
+    <title><literal>InnoDB</literal> Overview</title>
+
+    <indexterm>
+      <primary>InnoDB</primary>
+    </indexterm>
+
+    <indexterm>
+      <primary>transactions</primary>
+      <secondary>support</secondary>
+    </indexterm>
+
+    <indexterm>
+      <primary>transaction-safe tables</primary>
+    </indexterm>
+
+    <indexterm>
+      <primary>ACID</primary>
+    </indexterm>
+
+    <para>
+      <literal>InnoDB</literal> provides MySQL with a transaction-safe
+      (<literal>ACID</literal> compliant) storage engine that has
+      commit, rollback, and crash recovery capabilities.
+      <literal>InnoDB</literal> does locking on the row level and also
+      provides an Oracle-style consistent non-locking read in
+      <literal>SELECT</literal> statements. These features increase
+      multi-user concurrency and performance. There is no need for lock
+      escalation in <literal>InnoDB</literal> because row-level locks
+      fit in very little space. <literal>InnoDB</literal> also supports
+      <literal>FOREIGN KEY</literal> constraints. You can freely mix
+      <literal>InnoDB</literal> tables with tables from other MySQL
+      storage engines, even within the same statement.
+    </para>
+
+    <para>
+      <literal>InnoDB</literal> has been designed for maximum
+      performance when processing large data volumes. Its CPU efficiency
+      is probably not matched by any other disk-based relational
+      database engine.
+    </para>
+
+    <para>
+      Fully integrated with MySQL Server, the <literal>InnoDB</literal>
+      storage engine maintains its own buffer pool for caching data and
+      indexes in main memory. <literal>InnoDB</literal> stores its
+      tables and indexes in a tablespace, which may consist of several
+      files (or raw disk partitions). This is different from, for
+      example, <literal>MyISAM</literal> tables where each table is
+      stored using separate files. <literal>InnoDB</literal> tables can
+      be of any size even on operating systems where file size is
+      limited to 2GB.
+    </para>
+
+    <para>
+      <literal>InnoDB</literal> is included in binary distributions by
+      default as of MySQL 4.0. For information about
+      <literal>InnoDB</literal> support in MySQL 3.23, see
+      <xref linkend="innodb-in-mysql-3-23"/>. Starting from MySQL 4.1.5,
+      the improved Windows installer makes <literal>InnoDB</literal> the
+      MySQL default storage engine on Windows.
+    </para>
+
+    <para>
+      <literal>InnoDB</literal> is used in production at numerous large
+      database sites requiring high performance. The famous Internet
+      news site Slashdot.org runs on <literal>InnoDB</literal>. Mytrix,
+      Inc. stores over 1TB of data in <literal>InnoDB</literal>, and
+      another site handles an average load of 800 inserts/updates per
+      second in <literal>InnoDB</literal>.
+    </para>
+
+    <para>
+      <literal>InnoDB</literal> is published under the same GNU GPL
+      License Version 2 (of June 1991) as MySQL. For more information on
+      MySQL licensing, see
+      <ulink url="http://www.mysql.com/company/legal/licensing/"/>.
+    </para>
+
+    <para>
+      A forum dedicated to the <literal>InnoDB</literal> storage engine
+      is available at <ulink url="&base-url-forum-list;?22"/>.
+    </para>
+
+  </section>
+
+  <section id="innodb-contact-information">
+
+    <title><literal>InnoDB</literal> Contact Information</title>
+
+    <para>
+      Contact information for Innobase Oy, producer of the
+      <literal>InnoDB</literal> engine:
+    </para>
+
+<programlisting>
+Web site: <ulink url="http://www.innodb.com/"/>
+Email: <email>sales@stripped</email>
+Phone: +358-9-6969 3250 (office)
+       +358-40-5617367 (mobile)
+
+Innobase Oy Inc.
+World Trade Center Helsinki
+Aleksanterinkatu 17
+P.O.Box 800
+00101 Helsinki
+Finland
+</programlisting>
+
+  </section>
+
+  <section id="innodb-in-mysql-3-23">
+
+    <title><literal>InnoDB</literal> in MySQL 3.23</title>
+
+    <para>
+      Beginning with MySQL 4.0, <literal>InnoDB</literal> is enabled by
+      default, so the following information applies only to MySQL 3.23.
+    </para>
+
+    <para>
+      <literal>InnoDB</literal> tables are included in the MySQL source
+      distribution starting from 3.23.34a and are activated in the
+      MySQL-Max binaries of the 3.23 series. For Windows, the MySQL-Max
+      binaries are included in the standard distribution.
+    </para>
+
+    <para>
+      If you have downloaded a binary version of MySQL that includes
+      support for <literal>InnoDB</literal>, simply follow the
+      instructions of the MySQL manual for installing a binary version
+      of MySQL. If you have MySQL 3.23 installed, the simplest way to
+      install MySQL-Max is to replace the executable
+      <command>mysqld</command> server with the corresponding executable
+      from the MySQL-Max distribution. MySQL and MySQL-Max differ only
+      in the server executable. See <xref linkend="installing-binary"/>,
+      and <xref linkend="mysqld-max"/>.
+    </para>
+
+    <para>
+      To compile the MySQL source code with <literal>InnoDB</literal>
+      support, download MySQL 3.23.34a or newer from
+      <ulink url="http://www.mysql.com/"/> and configure MySQL with the
+      <option>--with-innodb</option> option. See
+      <xref linkend="installing-source"/>.
+    </para>
+
+    <para>
+      To use <literal>InnoDB</literal> tables with MySQL 3.23, you must
+      specify configuration parameters in the
+      <literal>[mysqld]</literal> section of the
+      <filename>my.cnf</filename> option file. On Windows, you can use
+      <filename>my.ini</filename> instead. If you do not configure
+      <literal>InnoDB</literal> in the option file,
+      <literal>InnoDB</literal> does not start. (From MySQL 4.0 on,
+      <literal>InnoDB</literal> uses default parameters if you do not
+      specify any. However, to get best performance, it is still
+      recommended that you use parameters appropriate for your system,
+      as discussed in <xref linkend="innodb-configuration"/>.)
+    </para>
+
+    <para>
+      In MySQL 3.23, you must specify at the minimum an
+      <literal>innodb_data_file_path</literal> value to configure the
+      <literal>InnoDB</literal> data files. For example, to configure
+      <literal>InnoDB</literal> to use a single 500MB data file, place
+      the following setting in the <literal>[mysqld]</literal> section
+      of your option file:
+    </para>
+
+<programlisting>
+[mysqld]
+innodb_data_file_path=ibdata1:500M
+</programlisting>
+
+    <para>
+      <literal>InnoDB</literal> creates the
<filename>ibdata1</filename>
+      file in the MySQL data directory by default. To specify the
+      location explicitly, specify an
+      <literal>innodb_data_home_dir</literal> setting. See
+      <xref linkend="innodb-configuration"/>.
+    </para>
+
+  </section>
+
+  <section id="innodb-configuration">
+
+    <title><literal>InnoDB</literal> Configuration</title>
+
+    <para>
+      To enable <literal>InnoDB</literal> tables in MySQL 3.23, see
+      <xref linkend="innodb-in-mysql-3-23"/>.
+    </para>
+
+    <para>
+      From MySQL 4.0 on, the <literal>InnoDB</literal> storage engine is
+      enabled by default. If you don't want to use
+      <literal>InnoDB</literal> tables, you can add the
+      <literal>skip-innodb</literal> option to your MySQL option file.
+    </para>
+
+    <note>
+      <para>
+        <literal>InnoDB</literal> provides MySQL with a transaction-safe
+        (<literal>ACID</literal> compliant) storage engine that has
+        commit, rollback, and crash recovery capabilities.
+        <emphasis role="bold">However, it cannot do so</emphasis> if the
+        underlying operating system or hardware does not work as
+        advertised. Many operating systems or disk subsystems may delay
+        or reorder write operations to improve performance. On some
+        operating systems, the very system call that should wait until
+        all unwritten data for a file has been flushed &mdash;
+        <literal>fsync()</literal> &mdash; might actually return before
+        the data has been flushed to stable storage. Because of this, an
+        operating system crash or a power outage may destroy recently
+        committed data, or in the worst case, even corrupt the database
+        because of write operations having been reordered. If data
+        integrity is important to you, you should perform some
+        <quote>pull-the-plug</quote> tests before using anything in
+        production. On Mac OS X 10.3 and up, <literal>InnoDB</literal>
+        uses a special <literal>fcntl()</literal> file flush method.
+        Under Linux, it is advisable to <emphasis role="bold">disable
+        the write-back cache</emphasis>.
+      </para>
+    </note>
+
+    <para>
+      On ATAPI hard disks, a command such <literal>hdparm -W0
+      /dev/hda</literal> may work to disable the write-back cache.
+      <emphasis role="bold">Beware that some drives or disk controllers
+      may be unable to disable the write-back cache.</emphasis>
+    </para>
+
+    <para>
+      Two important disk-based resources managed by the
+      <literal>InnoDB</literal> storage engine are its tablespace data
+      files and its log files.
+    </para>
+
+    <note>
+      <para>
+        If you specify no <literal>InnoDB</literal> configuration
+        options, MySQL 4.0 and above create an auto-extending 10MB data
+        file named <filename>ibdata1</filename> and two 5MB log files
+        named <filename>ib_logfile0</filename> and
+        <filename>ib_logfile1</filename> in the MySQL data directory.
+        (In MySQL 4.0.0 and 4.0.1, the data file is 64MB and not
+        auto-extending.) In MySQL 3.23, <literal>InnoDB</literal> does
+        not start if you provide no configuration options. To get good
+        performance, you should explicitly provide
+        <literal>InnoDB</literal> parameters as discussed in the
+        following examples. Naturally, you should edit the settings to
+        suit your hardware and requirements.
+      </para>
+    </note>
+
+    <para>
+      The examples shown here are representative. See
+      <xref linkend="innodb-parameters"/> for additional information
+      about <literal>InnoDB</literal>-related configuration parameters.
+    </para>
+
+    <para>
+      To set up the <literal>InnoDB</literal> tablespace files, use the
+      <literal>innodb_data_file_path</literal> option in the
+      <literal>[mysqld]</literal> section of the
+      <filename>my.cnf</filename> option file. On Windows, you can use
+      <filename>my.ini</filename> instead. The value of
+      <literal>innodb_data_file_path</literal> should be a list of one
+      or more data file specifications. If you name more than one data
+      file, separate them by semicolon
+      (<quote><literal>;</literal></quote>) characters:
+    </para>
+
+<programlisting>
+innodb_data_file_path=<replaceable>datafile_spec1</replaceable>[;<replaceable>datafile_spec2</replaceable>]...
+</programlisting>
+
+    <para>
+      For example, a setting that explicitly creates a tablespace having
+      the same characteristics as the MySQL 4.0 default is as follows:
+    </para>
+
+<programlisting>
+[mysqld]
+innodb_data_file_path=ibdata1:10M:autoextend
+</programlisting>
+
+    <para>
+      This setting configures a single 10MB data file named
+      <filename>ibdata1</filename> that is auto-extending. No location
+      for the file is given, so by default, <literal>InnoDB</literal>
+      creates it in the MySQL data directory.
+    </para>
+
+    <para>
+      Sizes are specified using <literal>M</literal> or
+      <literal>G</literal> suffix letters to indicate units of MB or GB.
+    </para>
+
+    <para>
+      A tablespace containing a fixed-size 50MB data file named
+      <filename>ibdata1</filename> and a 50MB auto-extending file named
+      <filename>ibdata2</filename> in the data directory can be
+      configured like this:
+    </para>
+
+<programlisting>
+[mysqld]
+innodb_data_file_path=ibdata1:50M;ibdata2:50M:autoextend
+</programlisting>
+
+    <para>
+      The full syntax for a data file specification includes the
+      filename, its size, and several optional attributes:
+    </para>
+
+<programlisting>
+<replaceable>file_name</replaceable>:<replaceable>file_size</replaceable>[:autoextend[:max:<replaceable>max_file_size</replaceable>]]
+</programlisting>
+
+    <para>
+      The <literal>autoextend</literal> attribute and those following
+      can be used only for the last data file in the
+      <literal>innodb_data_file_path</literal> line.
+      <literal>autoextend</literal> is available starting from MySQL
+      3.23.50 and 4.0.2.
+    </para>
+
+    <para>
+      If you specify the <literal>autoextend</literal> option for the
+      last data file, <literal>InnoDB</literal> extends the data file if
+      it runs out of free space in the tablespace. The increment is 8MB
+      at a time by default. It can be modified by changing the
+      <literal>innodb_autoextend_increment</literal> system variable.
+    </para>
+
+    <para>
+      If the disk becomes full, you might want to add another data file
+      on another disk. Instructions for reconfiguring an existing
+      tablespace are given in <xref linkend="adding-and-removing"/>.
+    </para>
+
+    <para>
+      <literal>InnoDB</literal> is not aware of the filesystem maximum
+      file size, so be cautious on filesystems where the maximum file
+      size is a small value such as 2GB. To specify a maximum size for
+      an auto-extending data file, use the <literal>max</literal>
+      attribute. The following configuration allows
+      <filename>ibdata1</filename> to grow up to a limit of 500MB:
+    </para>
+
+<programlisting>
+[mysqld]
+innodb_data_file_path=ibdata1:10M:autoextend:max:500M
+</programlisting>
+
+    <para>
+      <literal>InnoDB</literal> creates tablespace files in the MySQL
+      data directory by default. To specify a location explicitly, use
+      the <literal>innodb_data_home_dir</literal> option. For example,
+      to use two files named <filename>ibdata1</filename> and
+      <filename>ibdata2</filename> but create them in the
+      <filename>/ibdata</filename> directory, configure
+      <literal>InnoDB</literal> like this:
+    </para>
+
+<programlisting>
+[mysqld]
+innodb_data_home_dir = /ibdata
+innodb_data_file_path=ibdata1:50M;ibdata2:50M:autoextend
+</programlisting>
+
+    <note>
+      <para>
+        <literal>InnoDB</literal> does not create directories, so make
+        sure that the <filename>/ibdata</filename> directory exists
+        before you start the server. This is also true of any log file
+        directories that you configure. Use the Unix or DOS
+        <literal>mkdir</literal> command to create any necessary
+        directories.
+      </para>
+    </note>
+
+    <para>
+      <literal>InnoDB</literal> forms the directory path for each data
+      file by textually concatenating the value of
+      <literal>innodb_data_home_dir</literal> to the data file name,
+      adding a pathname separator (slash or backslash) between values if
+      necessary. If the <literal>innodb_data_home_dir</literal> option
+      is not mentioned in <filename>my.cnf</filename> at all, the
+      default value is the <quote>dot</quote> directory
+      <filename>./</filename>, which means the MySQL data directory.
+      (The MySQL server changes its current working directory to its
+      data directory when it begins executing.)
+    </para>
+
+    <para>
+      If you specify <literal>innodb_data_home_dir</literal> as an empty
+      string, you can specify absolute paths for the data files listed
+      in the <literal>innodb_data_file_path</literal> value. The
+      following example is equivalent to the preceding one:
+    </para>
+
+<programlisting>
+[mysqld]
+innodb_data_home_dir =
+innodb_data_file_path=/ibdata/ibdata1:50M;/ibdata/ibdata2:50M:autoextend
+</programlisting>
+
+    <para>
+      <emphasis role="bold">A simple <filename>my.cnf</filename>
+      example.</emphasis> Suppose that you have a computer with 128MB
+      RAM and one hard disk. The following example shows possible
+      configuration parameters in <filename>my.cnf</filename> or
+      <filename>my.ini</filename> for <literal>InnoDB</literal>.
The
+      example assumes the use of MySQL-Max 3.23.50 or later or MySQL
+      4.0.2 or later because it uses the <literal>autoextend</literal>
+      attribute. The example suits most users, both on Unix and Windows,
+      who do not want to distribute <literal>InnoDB</literal> data files
+      and log files onto several disks. It creates an auto-extending
+      data file <filename>ibdata1</filename> and two
+      <literal>InnoDB</literal> log files
+      <filename>ib_logfile0</filename> and
+      <filename>ib_logfile1</filename> in the MySQL data directory.
+      Also, the small archived <literal>InnoDB</literal> log file
+      <filename>ib_arch_log_0000000000</filename> that
+      <literal>InnoDB</literal> creates automatically ends up in the
+      data directory.
+    </para>
+
+<programlisting>
+[mysqld]
+# You can write your other MySQL server options here
+# ...
+# Data files must be able to hold your data and indexes.
+# Make sure that you have enough free disk space.
+innodb_data_file_path = ibdata1:10M:autoextend
+#
+# Set buffer pool size to 50-80% of your computer's memory
+set-variable = innodb_buffer_pool_size=70M
+set-variable = innodb_additional_mem_pool_size=10M
+#
+# Set the log file size to about 25% of the buffer pool size
+set-variable = innodb_log_file_size=20M
+set-variable = innodb_log_buffer_size=8M
+#
+innodb_flush_log_at_trx_commit=1
+</programlisting>
+
+    <para>
+      Make sure that the MySQL server has the proper access rights to
+      create files in the data directory. More generally, the server
+      must have access rights in any directory where it needs to create
+      data files or log files.
+    </para>
+
+    <para>
+      Note that data files must be less than 2GB in some filesystems.
+      The combined size of the log files must be less than 4GB. The
+      combined size of data files must be at least 10MB.
+    </para>
+
+    <para>
+      When you create an <literal>InnoDB</literal> tablespace for the
+      first time, it is best that you start the MySQL server from the
+      command prompt. <literal>InnoDB</literal> then prints the
+      information about the database creation to the screen, so you can
+      see what is happening. For example, on Windows, if
+      <command>mysqld</command> is located in <filename>C:\Program
+      Files\MySQL\MySQL Server &current-series;\bin</filename>, you can
+      start it like this:
+    </para>
+
+<programlisting>
+C:\&gt; <userinput>"C:\Program Files\MySQL\MySQL Server
&current-series;\bin\mysqld" --console</userinput>
+</programlisting>
+
+    <para>
+      If you do not send server output to the screen, check the server's
+      error log to see what <literal>InnoDB</literal> prints during the
+      startup process.
+    </para>
+
+    <para>
+      See <xref linkend="innodb-init"/>, for an example of what the
+      information displayed by <literal>InnoDB</literal> should look
+      like.
+    </para>
+
+    <para>
+      You can place <literal>InnoDB</literal> options in the
+      <literal>[mysqld]</literal> group of any option file that your
+      server reads when it starts. The locations for option files are
+      described in <xref linkend="option-files"/>.
+    </para>
+
+    <para>
+      If you installed MySQL on Windows using the installation and
+      configuration wizards, the option file will be the
+      <filename>my.ini</filename> file located in your MySQL
+      installation directory. See
+      <xref linkend="mysql-config-wizard-file-location"/>.
+    </para>
+
+    <para>
+      If your PC uses a boot loader where the <filename>C:</filename>
+      drive is not the boot drive, your only option is to use the
+      <filename>my.ini</filename> file in your Windows directory
+      (typically <filename>C:\WINDOWS</filename> or
+      <filename>C:\WINNT</filename>). You can use the
+      <literal>SET</literal> command at the command prompt in a console
+      window to print the value of <literal>WINDIR</literal>:
+    </para>
+
+<programlisting>
+C:\&gt; <userinput>SET WINDIR</userinput>
+windir=C:\WINDOWS
+</programlisting>
+
+    <para>
+      If you want to make sure that <command>mysqld</command> reads
+      options only from a specific file, you can use the
+      <option>--defaults-file</option> option as the first option on the
+      command line when starting the server:
+    </para>
+
+<programlisting>
+mysqld --defaults-file=<replaceable>your_path_to_my_cnf</replaceable>
+</programlisting>
+
+    <para>
+      <emphasis role="bold">An advanced <filename>my.cnf</filename>
+      example.</emphasis> Suppose that you have a Linux computer with
+      2GB RAM and three 60GB hard disks at directory paths
+      <filename>/</filename>, <filename>/dr2</filename> and
+      <filename>/dr3</filename>. The following example shows possible
+      configuration parameters in <filename>my.cnf</filename> for
+      <literal>InnoDB</literal>.
+    </para>
+
+<programlisting>
+[mysqld]
+# You can write your other MySQL server options here
+# ...
+innodb_data_home_dir =
+#
+# Data files must be able to hold your data and indexes
+innodb_data_file_path = /ibdata/ibdata1:2000M;/dr2/ibdata/ibdata2:2000M:autoextend
+#
+# Set buffer pool size to 50-80% of your computer's memory,
+# but make sure on Linux x86 total memory usage is &lt; 2GB
+set-variable = innodb_buffer_pool_size=1G
+set-variable = innodb_additional_mem_pool_size=20M
+innodb_log_group_home_dir = /dr3/iblogs
+#
+# innodb_log_arch_dir must be the same as innodb_log_group_home_dir
+# (starting from 4.0.6, you can omit it)
+innodb_log_arch_dir = /dr3/iblogs
+set-variable = innodb_log_files_in_group=2
+#
+# Set the log file size to about 25% of the buffer pool size
+set-variable = innodb_log_file_size=250M
+set-variable = innodb_log_buffer_size=8M
+#
+innodb_flush_log_at_trx_commit=1
+set-variable = innodb_lock_wait_timeout=50
+#
+# Uncomment the next lines if you want to use them
+#set-variable = innodb_thread_concurrency=5
+</programlisting>
+
+    <para>
+      In some cases, database performance improves if all data is not
+      placed on the same physical disk. Putting log files on a different
+      disk from data is very often beneficial for performance. The
+      example illustrates how to do this. It places the two data files
+      on different disks and places the log files on the third disk.
+      <literal>InnoDB</literal> fills the tablespace beginning with the
+      first data file. You can also use raw disk partitions (raw
+      devices) as <literal>InnoDB</literal> data files, which may speed
+      up I/O. See <xref linkend="innodb-raw-devices"/>.
+    </para>
+
+    <warning>
+      <para>
+        On 32-bit GNU/Linux x86, you must be careful not to set memory
+        usage too high. <literal>glibc</literal> may allow the process
+        heap to grow over thread stacks, which crashes your server. It
+        is a risk if the value of the following expression is close to
+        or exceeds 2GB:
+      </para>
+    </warning>
+
+<programlisting>
+innodb_buffer_pool_size
++ key_buffer_size
++ max_connections*(sort_buffer_size+read_buffer_size+binlog_cache_size)
++ max_connections*2MB
+</programlisting>
+
+    <para>
+      Each thread uses a stack (often 2MB, but only 256KB in MySQL AB
+      binaries) and in the worst case also uses
+      <literal>sort_buffer_size + read_buffer_size</literal> additional
+      memory.
+    </para>
+
+    <para>
+      In MySQL 4.1, by compiling MySQL yourself, you can use up to 64GB
+      of physical memory in 32-bit Windows. See the description for
+      <literal>innodb_buffer_pool_awe_mem_mb</literal> in
+      <xref linkend="innodb-parameters"/>.
+    </para>
+
+    <para>
+      <emphasis role="bold">How to tune other <command>mysqld</command>
+      server parameters?</emphasis> The following values are typical and
+      suit most users:
+    </para>
+
+<programlisting>
+[mysqld]
+skip-external-locking
+set-variable = max_connections=200
+set-variable = read_buffer_size=1M
+set-variable = sort_buffer_size=1M
+#
+# Set key_buffer to 5 - 50% of your RAM depending on how much
+# you use MyISAM tables, but keep key_buffer_size + InnoDB
+# buffer pool size &lt; 80% of your RAM
+set-variable = key_buffer_size=...
+</programlisting>
+
+    <section id="multiple-tablespaces">
+
+      <title>Using Per-Table Tablespaces</title>
+
+      <note>
+        <para>
+          There is a known bug in versions prior to 4.1.8 that manifests
+          itself if you specify <literal>innodb_file_per_table</literal>
+          in <filename>my.cnf</filename>. If you shut down
+          <command>mysqld</command>, then records may disappear from the
+          secondary indexes of a table. See Bug #7496 for more
+          information and workarounds. This is fixed in 4.1.9, but
+          another bug (Bug #8021) bit the Windows version in 4.1.9, and
+          in the Windows version of 4.1.9 you must put the line
+          <literal>innodb_flush_method=unbuffered</literal> to your
+          <filename>my.cnf</filename> or
<filename>my.ini</filename> to
+          get <command>mysqld</command> to work.
+        </para>
+      </note>
+
+      <para>
+        Starting from MySQL 4.1.1, you can store each
+        <literal>InnoDB</literal> table and its indexes in its own file.
+        This feature is called <quote>multiple tablespaces</quote>
+        because in effect each table has its own tablespace.
+      </para>
+
+      <para>
+        Using multiple tablespaces can be beneficial to users who want
+        to move specific tables to separate physical disks or who wish
+        to restore backups of single tables quickly without interrupting
+        the use of the remaining <literal>InnoDB</literal> tables.
+      </para>
+
+      <para>
+        If you need to downgrade to 4.0, you must make table dumps and
+        re-create the whole <literal>InnoDB</literal> tablespace. If you
+        have not created new <literal>InnoDB</literal> tables under
+        MySQL 4.1.1 or later, and need to downgrade quickly, you can
+        also do a direct downgrade to the MySQL 4.0.18 or later in the
+        4.0 series. Before doing the direct downgrade to 4.0.x, you have
+        to end all client connections to the <command>mysqld</command>
+        server that is to be downgraded, and let it run the purge and
+        insert buffer merge operations to completion, so that
+        <literal>SHOW INNODB STATUS</literal> shows the main thread in
+        the state <literal>waiting for server activity</literal>. Then
+        you can shut down <command>mysqld</command> and start 4.0.18 or
+        later in the 4.0 series.
+      </para>
+
+      <para>
+        You can enable multiple tablespaces by adding a line to the
+        <literal>[mysqld]</literal> section of
+        <filename>my.cnf</filename>:
+      </para>
+
+<programlisting>
+[mysqld]
+innodb_file_per_table
+</programlisting>
+
+      <para>
+        After restarting the server, <literal>InnoDB</literal> stores
+        each newly created table into its own file
+       
<filename><replaceable>tbl_name</replaceable>.ibd</filename> in
+        the database directory where the table belongs. This is similar
+        to what the <literal>MyISAM</literal> storage engine does, but
+        <literal>MyISAM</literal> divides the table into a data file
+       
<filename><replaceable>tbl_name</replaceable>.MYD</filename> and
+        the index file
+       
<filename><replaceable>tbl_name</replaceable>.MYI</filename>.
+        For <literal>InnoDB</literal>, the data and the indexes are
+        stored together in the <filename>.ibd</filename> file. The
+       
<filename><replaceable>tbl_name</replaceable>.frm</filename>
+        file is still created as usual.
+      </para>
+
+      <para>
+        If you remove the <literal>innodb_file_per_table</literal> line
+        from <filename>my.cnf</filename> and restart the server,
+        <literal>InnoDB</literal> creates tables inside the shared
+        tablespace files again.
+      </para>
+
+      <para>
+        <literal>innodb_file_per_table</literal> affects only table
+        creation, not access to existing tables. If you start the server
+        with this option, new tables are created using
+        <filename>.ibd</filename> files, but you can still access tables
+        that exist in the shared tablespace. If you remove the option
+        and restart the server, new tables are created in the shared
+        tablespace, but you can still access any tables that were
+        created using multiple tablespaces.
+      </para>
+
+      <note>
+        <para>
+          <literal>InnoDB</literal> always needs the shared tablespace
+          because it puts its internal data dictionary and undo logs
+          there. The <filename>.ibd</filename> files are not sufficient
+          for <literal>InnoDB</literal> to operate.
+        </para>
+      </note>
+
+      <note>
+        <para>
+          You cannot freely move <filename>.ibd</filename> files between
+          database directories as you can with <literal>MyISAM</literal>
+          table files. This is because the table definition that is
+          stored in the <literal>InnoDB</literal> shared tablespace
+          includes the database name, and because
+          <literal>InnoDB</literal> must preserve the consistency of
+          transaction IDs and log sequence numbers.
+        </para>
+      </note>
+
+      <para>
+        To move an <filename>.ibd</filename> file and the associated
+        table from one database to another, use a <literal>RENAME
+        TABLE</literal> statement:
+      </para>
+
+<programlisting>
+RENAME TABLE <replaceable>db1.tbl_name</replaceable> TO
<replaceable>db2.tbl_name</replaceable>;
+</programlisting>
+
+      <indexterm>
+        <primary>DISCARD TABLESPACE</primary>
+      </indexterm>
+
+      <indexterm>
+        <primary>IMPORT TABLESPACE</primary>
+      </indexterm>
+
+      <para>
+        If you have a <quote>clean</quote> backup of an
+        <filename>.ibd</filename> file, you can restore it to the MySQL
+        installation from which it originated as follows:
+      </para>
+
+      <orderedlist>
+
+        <listitem>
+          <para>
+            Issue this <literal>ALTER TABLE</literal> statement:
+          </para>
+
+<programlisting>
+ALTER TABLE <replaceable>tbl_name</replaceable> DISCARD TABLESPACE;
+</programlisting>
+
+          <caution>
+            <para>
+              This statement deletes the current
+              <filename>.ibd</filename> file.
+            </para>
+          </caution>
+        </listitem>
+
+        <listitem>
+          <para>
+            Put the backup <filename>.ibd</filename> file back in the
+            proper database directory.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Issue this <literal>ALTER TABLE</literal> statement:
+          </para>
+
+<programlisting>
+ALTER TABLE <replaceable>tbl_name</replaceable> IMPORT TABLESPACE;
+</programlisting>
+        </listitem>
+
+      </orderedlist>
+
+      <para>
+        In this context, a <quote>clean</quote>
+        <filename>.ibd</filename> file backup means:
+      </para>
+
+      <itemizedlist>
+
+        <listitem>
+          <para>
+            There are no uncommitted modifications by transactions in
+            the <filename>.ibd</filename> file.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            There are no unmerged insert buffer entries in the
+            <filename>.ibd</filename> file.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Purge has removed all delete-marked index records from the
+            <filename>.ibd</filename> file.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <command>mysqld</command> has flushed all modified pages of
+            the <filename>.ibd</filename> file from the buffer pool to
+            the file.
+          </para>
+        </listitem>
+
+      </itemizedlist>
+
+      <para>
+        You can make a clean backup <filename>.ibd</filename> file using
+        the following method:
+      </para>
+
+      <orderedlist>
+
+        <listitem>
+          <para>
+            Stop all activity from the <command>mysqld</command> server
+            and commit all transactions.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Wait until <literal>SHOW INNODB STATUS</literal> shows that
+            there are no active transactions in the database, and the
+            main thread status of <literal>InnoDB</literal> is
+            <literal>Waiting for server activity</literal>. Then you can
+            make a copy of the <filename>.ibd</filename> file.
+          </para>
+        </listitem>
+
+      </orderedlist>
+
+      <para>
+        Another method for making a clean copy of an
+        <filename>.ibd</filename> file is to use the commercial
+        <command>InnoDB Hot Backup</command> tool:
+      </para>
+
+      <orderedlist>
+
+        <listitem>
+          <para>
+            Use <command>InnoDB Hot Backup</command> to back up the
+            <literal>InnoDB</literal> installation.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Start a second <command>mysqld</command> server on the
+            backup and let it clean up the <filename>.ibd</filename>
+            files in the backup.
+          </para>
+        </listitem>
+
+      </orderedlist>
+
+    </section>
+
+    <section id="innodb-raw-devices">
+
+      <title>Using Raw Devices for the Shared Tablespace</title>
+
+      <para>
+        Starting from MySQL 3.23.41, you can use raw disk partitions as
+        data files in the shared tablespace. By using a raw disk, you
+        can perform non-buffered I/O on Windows and on some Unix systems
+        without filesystem overhead, which might improve performance.
+      </para>
+
+      <para>
+        When you create a new data file, you must put the keyword
+        <literal>newraw</literal> immediately after the data file size
+        in <literal>innodb_data_file_path</literal>. The partition must
+        be at least as large as the size that you specify. Note that 1MB
+        in <literal>InnoDB</literal> is 1024 &times; 1024 bytes, whereas
+        1MB in disk specifications usually means 1,000,000 bytes.
+      </para>
+
+<programlisting>
+[mysqld]
+innodb_data_home_dir=
+innodb_data_file_path=/dev/hdd1:3Gnewraw;/dev/hdd2:2Gnewraw
+</programlisting>
+
+      <para>
+        The next time you start the server, <literal>InnoDB</literal>
+        notices the <literal>newraw</literal> keyword and initializes
+        the new partition. However, do not create or change any
+        <literal>InnoDB</literal> tables yet. Otherwise, when you next
+        restart the server, <literal>InnoDB</literal> reinitializes the
+        partition and your changes are lost. (Starting from MySQL
+        3.23.44, as a safety measure <literal>InnoDB</literal> prevents
+        users from modifying data when any partition with
+        <literal>newraw</literal> is specified.)
+      </para>
+
+      <para>
+        After <literal>InnoDB</literal> has initialized the new
+        partition, stop the server, change <literal>newraw</literal> in
+        the data file specification to <literal>raw</literal>:
+      </para>
+
+<programlisting>
+[mysqld]
+innodb_data_home_dir=
+innodb_data_file_path=/dev/hdd1:5Graw;/dev/hdd2:2Graw
+</programlisting>
+
+      <para>
+        Then restart the server and <literal>InnoDB</literal> allows
+        changes to be made.
+      </para>
+
+      <para>
+        On Windows, starting from 4.1.1, you can allocate a disk
+        partition as a data file like this:
+      </para>
+
+<programlisting>
+[mysqld]
+innodb_data_home_dir=
+innodb_data_file_path=//./D::10Gnewraw
+</programlisting>
+
+      <para>
+        The <filename>//./</filename> corresponds to the Windows syntax
+        of <filename>\\.\</filename> for accessing physical drives.
+      </para>
+
+      <para>
+        When you use raw disk partitions, be sure that they have
+        permissions that allow read and write access by the account used
+        for running the MySQL server.
+      </para>
+
+    </section>
+
+  </section>
+
+  <section id="innodb-parameters">
+
+    <title><literal>InnoDB</literal> Startup Options and System
Variables</title>
+
+    <para>
+      This section describes the <literal>InnoDB</literal>-related
+      command options and system variables. System variables that take a
+      numeric value can be specified as
+     
<option>--<replaceable>var_name</replaceable>=<replaceable>value</replaceable></option>
+      on the command line or as
+     
<literal><replaceable>var_name</replaceable>=<replaceable>value</replaceable></literal>
+      in option files. Many of the system variables can be changed at
+      runtime (see <xref linkend="dynamic-system-variables"/>). (Before
+      MySQL 4.0.2, system variable values should be specified using
+      <option>--set-variable</option> syntax.) For more information on
+      specifying options and system variables, see
+      <xref linkend="program-options"/>.
+    </para>
+
+    <indexterm>
+      <primary>NFS</primary>
+      <secondary>InnoDB</secondary>
+    </indexterm>
+
+    <indexterm>
+      <primary>InnoDB</primary>
+      <secondary>NFS</secondary>
+    </indexterm>
+
+    <note>
+      <para>
+        It is not a good idea to configure <literal>InnoDB</literal> to
+        use datafiles or logfiles on NFS volumes. Otherwise, the files
+        might be locked by other processes and become unavailable for
+        use by MySQL.
+      </para>
+    </note>
+
+    <formalpara role="mnmas">
+
+      <title>MySQL Enterprise</title>
+
+      <para>
+        The MySQL Enterprise Monitor provides expert advice on InnoDB
+        start-up options and related system variables. For more
+        information see,
+        <ulink url="&base-url-enterprise;advisors.html"/>.
+      </para>
+
+    </formalpara>
+    
+    <para condition="dynamic:optvar:fullsummary" role="4.1:mysqld:section-innodb"/>
+
+    <para>
+      <literal>InnoDB</literal> command options:
+    </para>
+
+    <itemizedlist>
+
+      <listitem>
+        <para id="optvar_innodb">
+          <indexterm>
+            <primary>mysqld</primary>
+            <secondary>innodb option</secondary>
+          </indexterm>
+
+          <indexterm>
+            <primary>innodb option</primary>
+            <secondary>mysqld</secondary>
+          </indexterm>
+
+          <option>--innodb</option>
+        </para>
+
+        <para>
+          Enables the <literal>InnoDB</literal> storage engine, if the
+          server was compiled with <literal>InnoDB</literal> support.
+          Use <option>--skip-innodb</option> to disable
+          <literal>InnoDB</literal>.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_status_file">
+          <indexterm>
+            <primary>mysqld</primary>
+            <secondary>innodb_status_file option</secondary>
+          </indexterm>
+
+          <indexterm>
+            <primary>innodb_status_file option</primary>
+            <secondary>mysqld</secondary>
+          </indexterm>
+
+          <option>--innodb_status_file</option>
+        </para>
+
+        <para>
+          Causes <literal>InnoDB</literal> to create a file named
+         
<filename><replaceable>&lt;datadir&gt;</replaceable>/innodb_status.<replaceable>&lt;pid&gt;</replaceable></filename>
+          in the MySQL data directory. <literal>InnoDB</literal>
+          periodically writes the output of <literal>SHOW ENGINE INNODB
+          STATUS</literal> to this file. This option is available as of
+          MySQL 4.0.21.
+        </para>
+      </listitem>
+
+    </itemizedlist>
+
+    <para>
+      <literal>InnoDB</literal> system variables:
+    </para>
+
+    <itemizedlist>
+
+      <listitem>
+        <para id="optvar_innodb_additional_mem_pool_size">
+          <literal>innodb_additional_mem_pool_size</literal>
+        </para>
+
+        <para>
+          The size in bytes of a memory pool <literal>InnoDB</literal>
+          uses to store data dictionary information and other internal
+          data structures. The more tables you have in your application,
+          the more memory you need to allocate here. If
+          <literal>InnoDB</literal> runs out of memory in this pool, it
+          starts to allocate memory from the operating system, and
+          writes warning messages to the MySQL error log. The default
+          value is 1MB.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_autoextend_increment">
+          <literal>innodb_autoextend_increment</literal>
+        </para>
+
+        <para>
+          The increment size (in MB) for extending the size of an
+          auto-extending tablespace when it becomes full. The default
+          value is 8. This variable is available starting from MySQL
+          4.0.24 and 4.1.5. As of MySQL 4.0.24 and 4.1.6, it can be
+          changed at runtime as a global system variable.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_buffer_pool_awe_mem_mb">
+          <literal>innodb_buffer_pool_awe_mem_mb</literal>
+        </para>
+
+        <para>
+          The size of the buffer pool (in MB), if it is placed in the
+          AWE memory. This is relevant only in 32-bit Windows. If your
+          32-bit Windows operating system supports more than 4GB memory,
+          using so-called <quote>Address Windowing Extensions,</quote>
+          you can allocate the <literal>InnoDB</literal> buffer pool
+          into the AWE physical memory using this variable. The maximum
+          possible value for this variable is 63000. If it is greater
+          than 0, <literal>innodb_buffer_pool_size</literal> is the
+          window in the 32-bit address space of
+          <command>mysqld</command> where
<literal>InnoDB</literal> maps
+          that AWE memory. A good value for
+          <literal>innodb_buffer_pool_size</literal> is 500MB. This
+          variable is available as of MySQL 4.1.0.
+        </para>
+
+        <para>
+          To take advantage of AWE memory, you will need to recompile
+          MySQL yourself. The current project settings needed for doing
+          this can be found in the
+          <filename>innobase/os/os0proj.c</filename> source file.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_buffer_pool_size">
+          <literal>innodb_buffer_pool_size</literal>
+        </para>
+
+        <para>
+          The size in bytes of the memory buffer
+          <literal>InnoDB</literal> uses to cache data and indexes of
+          its tables. The larger you set this value, the less disk I/O
+          is needed to access data in tables. On a dedicated database
+          server, you may set this to up to 80% of the machine physical
+          memory size. However, do not set it too large because
+          competition for physical memory might cause paging in the
+          operating system.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_data_file_path">
+          <literal>innodb_data_file_path</literal>
+        </para>
+
+        <para>
+          The paths to individual data files and their sizes. The full
+          directory path to each data file is acquired by concatenating
+          <literal>innodb_data_home_dir</literal> to each path specified
+          here. The file sizes are specified in MB or GB (1024MB) by
+          appending <literal>M</literal> or <literal>G</literal>
to the
+          size value. The sum of the sizes of the files must be at least
+          10MB. On some operating systems, files must be less than 2GB.
+          If you do not specify
+          <literal>innodb_data_file_path</literal>, the default behavior
+          starting from 4.0 is to create a single 10MB auto-extending
+          data file named <filename>ibdata1</filename>. Starting from
+          3.23.44, you can set the file size larger than 4GB on those
+          operating systems that support big files. You can also use raw
+          disk partitions as data files. See
+          <xref linkend="innodb-raw-devices"/>.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_data_home_dir">
+          <literal>innodb_data_home_dir</literal>
+        </para>
+
+        <para>
+          The common part of the directory path for all
+          <literal>InnoDB</literal> data files. If you do not set this
+          value, the default is the MySQL data directory. You can
+          specify this also as an empty string, in which case you can
+          use absolute file paths in
+          <literal>innodb_data_file_path</literal>.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_fast_shutdown">
+          <literal>innodb_fast_shutdown</literal>
+        </para>
+
+        <para>
+          If you set this variable to 0, <literal>InnoDB</literal> does
+          a full purge and an insert buffer merge before a shutdown.
+          These operations can take minutes, or even hours in extreme
+          cases. If you set this variable to 1,
+          <literal>InnoDB</literal> skips these operations at shutdown.
+          The default value is 1 starting from 3.23.50.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_file_io_threads">
+          <literal>innodb_file_io_threads</literal>
+        </para>
+
+        <para>
+          The number of file I/O threads in <literal>InnoDB</literal>.
+          Normally, this should be left at the default value of 4, but
+          disk I/O on Windows may benefit from a larger number. On Unix,
+          increasing the number has no effect; <literal>InnoDB</literal>
+          always uses the default value. This variable is available as
+          of MySQL 3.23.37.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_file_per_table">
+          <literal>innodb_file_per_table</literal>
+        </para>
+
+        <note>
+          <para>
+            A bug in versions &lt;= 4.1.8 if you specify
+            <literal>innodb_file_per_table</literal> in
+            <filename>my.cnf</filename>! If you shut down
+            <command>mysqld</command>, then records may disappear from
+            the secondary indexes of a table. See Bug #7496 for more
+            information and workarounds. This is fixed in 4.1.9, but
+            another bug (Bug #8021) bit the Windows version in 4.1.9,
+            and in the Windows version of 4.1.9 you must put the line
+            <literal>innodb_flush_method=unbuffered</literal> in your
+            <filename>my.cnf</filename> or
<filename>my.ini</filename>
+            to get <command>mysqld</command> to work.
+          </para>
+        </note>
+
+        <para>
+          If this variable is enabled, <literal>InnoDB</literal> creates
+          each new table using its own <filename>.ibd</filename> file
+          for storing data and indexes, rather than in the shared
+          tablespace. The default is to create tables in the shared
+          tablespace. See <xref linkend="multiple-tablespaces"/>. This
+          variable is available as of MySQL 4.1.1.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_flush_log_at_trx_commit">
+          <literal>innodb_flush_log_at_trx_commit</literal>
+        </para>
+
+        <para>
+          When <literal>innodb_flush_log_at_trx_commit</literal> is set
+          to 0, the log buffer is written out to the log file once per
+          second and the flush to disk operation is performed on the log
+          file, but nothing is done at a transaction commit. When this
+          value is 1 (the default), the log buffer is written out to the
+          log file at each transaction commit and the flush to disk
+          operation is performed on the log file. When set to 2, the log
+          buffer is written out to the file at each commit, but the
+          flush to disk operation is not performed on it. However, the
+          flushing on the log file takes place once per second also when
+          the value is 2. Note that the once-per-second flushing is not
+          100% guaranteed to happen every second, due to process
+          scheduling issues.
+        </para>
+
+        <para>
+          The default value of this variable is 1, which is the value
+          that is required for ACID compliance. You can achieve better
+          performance by setting the value different from 1, but then
+          you can lose at most one second worth of transactions in a
+          crash. If you set the value to 0, then any
+          <command>mysqld</command> process crash can erase the last
+          second of transactions. If you set the value to 2, then only
+          an operating system crash or a power outage can erase the last
+          second of transactions. However, <literal>InnoDB</literal>'s
+          crash recovery is not affected and thus crash recovery does
+          work regardless of the value. Note that many operating systems
+          and some disk hardware fool the flush-to-disk operation. They
+          may tell <command>mysqld</command> that the flush has taken
+          place, even though it has not. Then the durability of
+          transactions is not guaranteed even with the setting 1, and in
+          the worst case a power outage can even corrupt the
+          <literal>InnoDB</literal> database. Using a battery-backed
+          disk cache in the SCSI disk controller or in the disk itself
+          speeds up file flushes, and makes the operation safer. You can
+          also try using the Unix command <command>hdparm</command> to
+          disable the caching of disk writes in hardware caches, or use
+          some other command specific to the hardware vendor. The
+          default value of this variable is 1 (prior to MySQL 4.0.13,
+          the default is 0).
+        </para>
+
+        <para>
+          Note: For the greatest possible durability and consistency in
+          a replication setup using <literal>InnoDB</literal> with
+          transactions, you should use
+          <literal>innodb_flush_log_at_trx_commit=1</literal>,
+          <literal>sync_binlog=1</literal>, and
+          <literal>innodb_safe_binlog</literal> in your master server
+          <filename>my.cnf</filename> file.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_flush_method">
+          <literal>innodb_flush_method</literal>
+        </para>
+
+        <para>
+          If set to <literal>fdatasync</literal> (the default),
+          <literal>InnoDB</literal> uses
<literal>fsync()</literal> to
+          flush both the data and log files. If set to
+          <literal>O_DSYNC</literal>, <literal>InnoDB</literal>
uses
+          <literal>O_SYNC</literal> to open and flush the log files, but
+          uses <literal>fsync()</literal> to flush the data files. If
+          <literal>O_DIRECT</literal> is specified (available on some
+          GNU/Linux versions starting from MySQL 4.0.14),
+          <literal>InnoDB</literal> uses
<literal>O_DIRECT</literal> to
+          open the data files, and uses <literal>fsync()</literal> to
+          flush both the data and log files. Note that starting from
+          MySQL 3.23.41, <literal>InnoDB</literal> uses
+          <literal>fsync()</literal> instead of
+          <literal>fdatasync()</literal>, and it does not use
+          <literal>O_DSYNC</literal> by default because there have been
+          problems with it on many varieties of Unix. This variable is
+          relevant only for Unix. On Windows, the flush method is always
+          <literal>async_unbuffered</literal> and cannot be changed.
+          This variable is available as of MySQL 3.23.40.
+        </para>
+
+        <para>
+          Different values of this variable can have a marked effect on
+          <literal>InnoDB performance</literal>. For example, on some
+          systems where <literal>InnoDB</literal> data and log files are
+          located on a SAN, it has been found that setting
+          <literal>innodb_flush_method</literal> to
+          <literal>O_DIRECT</literal> can degrade performance of simple
+          <literal>SELECT</literal> statements by a factor of three.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_force_recovery">
+          <literal>innodb_force_recovery</literal>
+        </para>
+
+        <para>
+          The crash recovery mode.
+        </para>
+
+        <warning>
+          <para>
+            This variable should be set greater than 0 only in an
+            emergency situation when you want to dump your tables from a
+            corrupt database! Possible values are from 1 to 6. The
+            meanings of these values are described in
+            <xref linkend="forcing-recovery"/>. As a safety measure,
+            <literal>InnoDB</literal> prevents any changes to its data
+            when this variable is greater than 0. This variable is
+            available starting from MySQL 3.23.44.
+          </para>
+        </warning>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_lock_wait_timeout">
+          <literal>innodb_lock_wait_timeout</literal>
+        </para>
+
+        <para>
+          The timeout in seconds an <literal>InnoDB</literal>
+          transaction may wait for a lock before being rolled back.
+          <literal>InnoDB</literal> automatically detects transaction
+          deadlocks in its own lock table and rolls back the
+          transaction. Beginning with MySQL 4.0.20 and 4.1.2,
+          <literal>InnoDB</literal> notices locks set using the
+          <literal>LOCK TABLES</literal> statement. Before that, if you
+          use the <literal>LOCK TABLES</literal> statement, or other
+          transaction-safe storage engines than
+          <literal>InnoDB</literal> in the same transaction, a deadlock
+          may arise that <literal>InnoDB</literal> cannot notice. In
+          cases like this, the timeout is useful to resolve the
+          situation. The default is 50 seconds.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_locks_unsafe_for_binlog">
+          <literal>innodb_locks_unsafe_for_binlog</literal>
+        </para>
+
+        <para>
+          <indexterm>
+            <primary>next-key lock</primary>
+          </indexterm>
+
+          <indexterm>
+            <primary>gap lock</primary>
+          </indexterm>
+
+          This variable controls next-key locking in
+          <literal>InnoDB</literal> searches and index scans. By
+          default, this variable is 0 (disabled), which means that
+          next-key locking is enabled.
+        </para>
+
+        <para>
+          Normally, <literal>InnoDB</literal> uses an algorithm called
+          <firstterm>next-key locking</firstterm>.
+          <literal>InnoDB</literal> performs row-level locking in such a
+          way that when it searches or scans a table index, it sets
+          shared or exclusive locks on any index records it encounters.
+          Thus, the row-level locks are actually index record locks. The
+          locks that <literal>InnoDB</literal> sets on index records
+          also affect the <quote>gap</quote> preceding that index
+          record. If a user has a shared or exclusive lock on record
+          <emphasis>R</emphasis> in an index, another user cannot insert
+          a new index record immediately before <emphasis>R</emphasis>
+          in the order of the index. Enabling this variable causes
+          <literal>InnoDB</literal> not to use next-key locking in
+          searches or index scans. Next-key locking is still used to
+          ensure foreign key constraints and duplicate key checking.
+          Note that enabling this variable may cause phantom problems:
+          Suppose that you want to read and lock all children from the
+          <literal>child</literal> table with an identifier value larger
+          than 100, with the intention of updating some column in the
+          selected rows later:
+        </para>
+
+<programlisting>
+SELECT * FROM child WHERE id &gt; 100 FOR UPDATE;
+</programlisting>
+
+        <para>
+          Suppose that there is an index on the <literal>id</literal>
+          column. The query scans that index starting from the first
+          record where <literal>id</literal> is larger than 100. If the
+          locks set on the index records do not lock out inserts made in
+          the gaps, another client can insert a new row into the table.
+          If you execute the same <literal>SELECT</literal> within the
+          same transaction, you see a new row in the result set returned
+          by the query. This also means that if new items are added to
+          the database, <literal>InnoDB</literal> does not guarantee
+          serializability Therefore, if this variable is enabled
+          <literal>InnoDB</literal> guarantees at most isolation level
+          <literal>READ COMMITTED</literal>. (Conflict serializability
+          is still guaranteed.) This variable is available as of MySQL
+          4.1.4.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_log_arch_dir">
+          <literal>innodb_log_arch_dir</literal>
+        </para>
+
+        <para>
+          The directory where fully written log files would be archived
+          if we used log archiving. The value of this variable should
+          currently be set the same as
+          <literal>innodb_log_group_home_dir</literal>. Starting from
+          MySQL 4.0.6, there is no need to set this variable.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_log_archive">
+          <literal>innodb_log_archive</literal>
+        </para>
+
+        <para>
+          Whether to log <literal>InnoDB</literal> archive files. This
+          variable is unused. Recovery from a backup is done by MySQL
+          using its own log files, so there is no need to archive
+          <literal>InnoDB</literal> log files. The default for this
+          variable is 0.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_log_buffer_size">
+          <literal>innodb_log_buffer_size</literal>
+        </para>
+
+        <para>
+          The size in bytes of the buffer that <literal>InnoDB</literal>
+          uses to write to the log files on disk. Sensible values range
+          from 1MB to 8MB. The default is 1MB. A large log buffer allows
+          large transactions to run without a need to write the log to
+          disk before the transactions commit. Thus, if you have big
+          transactions, making the log buffer larger saves disk I/O.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_log_file_size">
+          <literal>innodb_log_file_size</literal>
+        </para>
+
+        <para>
+          The size in bytes of each log file in a log group. The
+          combined size of log files must be less than 4GB on 32-bit
+          computers. The default is 5MB. Sensible values range from 1MB
+          to 1/<replaceable>N</replaceable>-th of the size of the buffer
+          pool, where <replaceable>N</replaceable> is the number of log
+          files in the group. The larger the value, the less checkpoint
+          flush activity is needed in the buffer pool, saving disk I/O.
+          But larger log files also mean that recovery is slower in case
+          of a crash.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_log_files_in_group">
+          <literal>innodb_log_files_in_group</literal>
+        </para>
+
+        <para>
+          The number of log files in the log group.
+          <literal>InnoDB</literal> writes to the files in a circular
+          fashion. The default (and recommended) is 2.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_log_group_home_dir">
+          <literal>innodb_log_group_home_dir</literal>
+        </para>
+
+        <para>
+          The directory path to the <literal>InnoDB</literal> log files.
+          It must have the same value as
+          <literal>innodb_log_arch_dir</literal>. If you do not specify
+          any <literal>InnoDB</literal> log variables, the default is to
+          create two 5MB files names <filename>ib_logfile0</filename>
+          and <filename>ib_logfile1</filename> in the MySQL data
+          directory.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_max_dirty_pages_pct">
+          <literal>innodb_max_dirty_pages_pct</literal>
+        </para>
+
+        <para>
+          This is an integer in the range from 0 to 100. The default is
+          90. The main thread in <literal>InnoDB</literal> tries to
+          write pages from the buffer pool so that the percentage of
+          dirty (not yet written) pages will not exceed this value.
+          Available starting from 4.0.13 and 4.1.1.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_max_purge_lag">
+          <literal>innodb_max_purge_lag</literal>
+        </para>
+
+        <para>
+          This variable controls how to delay <literal>INSERT</literal>,
+          <literal>UPDATE</literal> and <literal>DELETE</literal>
+          operations when the purge operations are lagging (see
+          <xref linkend="innodb-multi-versioning"/>). The default value
+          of this variable is 0, meaning that there are no delays.
+          <literal>innodb_max_purge_lag</literal> is available as of
+          MySQL 4.0.22 and 4.1.6.
+        </para>
+
+        <para>
+          The <literal>InnoDB</literal> transaction system maintains a
+          list of transactions that have delete-marked index records by
+          <literal>UPDATE</literal> or <literal>DELETE</literal>
+          operations. Let the length of this list be
+          <replaceable>purge_lag</replaceable>. When
+          <replaceable>purge_lag</replaceable> exceeds
+          <literal>innodb_max_purge_lag</literal>, each
+          <literal>INSERT</literal>, <literal>UPDATE</literal>
and
+          <literal>DELETE</literal> operation is delayed by
+         
((<replaceable>purge_lag</replaceable>/<literal>innodb_max_purge_lag</literal>)&times;10)&minus;5
+          milliseconds. The delay is computed in the beginning of a
+          purge batch, every ten seconds. The operations are not delayed
+          if purge cannot run because of an old consistent read view
+          that could see the rows to be purged.
+        </para>
+
+        <para>
+          A typical setting for a problematic workload might be 1
+          million, assuming that our transactions are small, only 100
+          bytes in size, and we can allow 100MB of unpurged rows in our
+          tables.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_mirrored_log_groups">
+          <literal>innodb_mirrored_log_groups</literal>
+        </para>
+
+        <para>
+          The number of identical copies of log groups to keep for the
+          database. Currently, this should be set to 1.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_open_files">
+          <literal>innodb_open_files</literal>
+        </para>
+
+        <para>
+          This variable is relevant only if you use multiple tablespaces
+          in <literal>InnoDB</literal>. It specifies the maximum number
+          of <filename>.ibd</filename> files that
+          <literal>InnoDB</literal> can keep open at one time. The
+          minimum value is 10. The default is 300. This variable is
+          available as of MySQL 4.1.1.
+        </para>
+
+        <para>
+          The file descriptors used for <filename>.ibd</filename> files
+          are for <literal>InnoDB</literal> only. They are independent
+          of those specified by the <option>--open-files-limit</option>
+          server option, and do not affect the operation of the table
+          cache.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_safe_binlog">
+          <literal>innodb_safe_binlog</literal>
+        </para>
+
+        <para>
+          Adds consistency guarantees between the content of
+          <literal>InnoDB</literal> tables and the binary log. See
+          <xref linkend="binary-log"/>.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_table_locks">
+          <literal>innodb_table_locks</literal>
+        </para>
+
+        <para>
+          Starting from MySQL 4.0.20, and 4.1.2,
+          <literal>InnoDB</literal> honors <literal>LOCK
+          TABLES</literal>. If <literal>AUTOCOMMIT=0</literal>,
+          <literal>InnoDB</literal> honors <literal>LOCK
+          TABLES</literal>; MySQL does not return from <literal>LOCK
+          TABLE .. WRITE</literal> until all other threads have released
+          all their locks to the table. The default value of
+          <literal>innodb_table_locks</literal> is 1, which means that
+          <literal>LOCK TABLES</literal> causes InnoDB to lock a table
+          internally if <literal>AUTOCOMMIT=0</literal>.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para id="optvar_innodb_thread_concurrency">
+          <literal>innodb_thread_concurrency</literal>
+        </para>
+
+        <para>
+          <literal>InnoDB</literal> tries to keep the number of
+          operating system threads concurrently inside
+          <literal>InnoDB</literal> less than or equal to the limit
+          given by this variable. The default value is 8. If you have
+          low performance and <literal>SHOW INNODB STATUS</literal>
+          reveals many threads waiting for semaphores, you may have
+          thread thrashing and should try setting this variable lower or
+          higher. If you have a computer with many processors and disks,
+          you can try setting the value higher to better utilize the
+          resources of your computer. A recommended value is the sum of
+          the number of processors and disks your system has. A value of
+          500 or greater disables the concurrency checking. This
+          variable is available starting from MySQL 3.23.44 and 4.0.1.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          <literal>sync_binlog</literal>
+        </para>
+
+        <para>
+          If the value of this variable is positive, the MySQL server
+          synchronizes its binary log to disk
+          (<literal>fdatasync()</literal>) after every
+          <literal>sync_binlog</literal> writes to this binary log. Note
+          that there is one write to the binary log per statement if in
+          autocommit mode, and otherwise one write per transaction. The
+          default value is 0 which does no sync'ing to disk. A value of
+          1 is the safest choice, because in case of crash you lose at
+          most one statement/transaction from the binary log; but it is
+          also the slowest choice (unless the disk has a battery-backed
+          cache, which makes sync'ing very fast). This variable was
+          added in MySQL 4.1.3.
+        </para>
+      </listitem>
+
+    </itemizedlist>
+
+  </section>
+
+  <section id="innodb-init">
+
+    <title>Creating the <literal>InnoDB</literal>
Tablespace</title>
+
+    <para>
+      Suppose that you have installed MySQL and have edited your option
+      file so that it contains the necessary <literal>InnoDB</literal>
+      configuration parameters. Before starting MySQL, you should verify
+      that the directories you have specified for
+      <literal>InnoDB</literal> data files and log files exist and that
+      the MySQL server has access rights to those directories.
+      <literal>InnoDB</literal> does not create directories, only files.
+      Check also that you have enough disk space for the data and log
+      files.
+    </para>
+
+    <para>
+      It is best to run the MySQL server <command>mysqld</command> from
+      the command prompt when you first start the server with
+      <literal>InnoDB</literal> enabled, not from the
+      <command>mysqld_safe</command> wrapper or as a Windows service.
+      When you run from a command prompt you see what
+      <command>mysqld</command> prints and what is happening. On Unix,
+      just invoke <command>mysqld</command>. On Windows, use the
+      <option>--console</option> option.
+    </para>
+
+    <para>
+      When you start the MySQL server after initially configuring
+      <literal>InnoDB</literal> in your option file,
+      <literal>InnoDB</literal> creates your data files and log files,
+      and prints something like this:
+    </para>
+
+<programlisting>
+InnoDB: The first specified datafile /home/heikki/data/ibdata1
+did not exist:
+InnoDB: a new database to be created!
+InnoDB: Setting file /home/heikki/data/ibdata1 size to 134217728
+InnoDB: Database physically writes the file full: wait...
+InnoDB: datafile /home/heikki/data/ibdata2 did not exist:
+new to be created
+InnoDB: Setting file /home/heikki/data/ibdata2 size to 262144000
+InnoDB: Database physically writes the file full: wait...
+InnoDB: Log file /home/heikki/data/logs/ib_logfile0 did not exist:
+new to be created
+InnoDB: Setting log file /home/heikki/data/logs/ib_logfile0 size
+to 5242880
+InnoDB: Log file /home/heikki/data/logs/ib_logfile1 did not exist:
+new to be created
+InnoDB: Setting log file /home/heikki/data/logs/ib_logfile1 size
+to 5242880
+InnoDB: Doublewrite buffer not found: creating new
+InnoDB: Doublewrite buffer created
+InnoDB: Creating foreign key constraint system tables
+InnoDB: Foreign key constraint system tables created
+InnoDB: Started
+mysqld: ready for connections
+</programlisting>
+
+    <para>
+      At this point <literal>InnoDB</literal> has initialized its
+      tablespace and log files. You can connect to the MySQL server with
+      the usual MySQL client programs like <command>mysql</command>.
+      When you shut down the MySQL server with <command>mysqladmin
+      shutdown</command>, the output is like this:
+    </para>
+
+<programlisting>
+010321 18:33:34  mysqld: Normal shutdown
+010321 18:33:34  mysqld: Shutdown Complete
+InnoDB: Starting shutdown...
+InnoDB: Shutdown completed
+</programlisting>
+
+    <para>
+      You can look at the data file and log directories and you see the
+      files created there. The log directory also contains a small file
+      named <filename>ib_arch_log_0000000000</filename>. That file
+      resulted from the database creation, after which
+      <literal>InnoDB</literal> switched off log archiving. When MySQL
+      is started again, the data files and log files have been created
+      already, so the output is much briefer:
+    </para>
+
+<programlisting>
+InnoDB: Started
+mysqld: ready for connections
+</programlisting>
+
+    <para>
+      Starting from MySQL 4.1.1, you can add the option
+      <literal>innodb_file_per_table</literal> to
+      <filename>my.cnf</filename> to make
<literal>InnoDB</literal>
+      store each table to its own <filename>.ibd</filename> file in the
+      same MySQL database directory where the <filename>.frm</filename>
+      file is created. See <xref linkend="multiple-tablespaces"/>.
+    </para>
+
+    <section id="error-creating-innodb">
+
+      <title>Dealing with <literal>InnoDB</literal> Initialization
Problems</title>
+
+      <para>
+        If <literal>InnoDB</literal> prints an operating system error
+        during a file operation, usually the problem has one of the
+        following causes:
+      </para>
+
+      <itemizedlist>
+
+        <listitem>
+          <para>
+            You did not create the <literal>InnoDB</literal> data file
+            directory or the <literal>InnoDB</literal> log directory.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <command>mysqld</command> does not have access rights to
+            create files in those directories.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <command>mysqld</command> cannot read the proper
+            <filename>my.cnf</filename> or
<filename>my.ini</filename>
+            option file, and consequently does not see the options that
+            you specified.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            The disk is full or a disk quota is exceeded.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            You have created a subdirectory whose name is equal to a
+            data file that you specified, so the name cannot be used as
+            a filename.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            There is a syntax error in the
+            <literal>innodb_data_home_dir</literal> or
+            <literal>innodb_data_file_path</literal> value.
+          </para>
+        </listitem>
+
+      </itemizedlist>
+
+      <para>
+        If something goes wrong when <literal>InnoDB</literal> attempts
+        to initialize its tablespace or its log files, you should delete
+        all files created by <literal>InnoDB</literal>. This means all
+        <filename>ibdata</filename> files and all
+        <filename>ib_logfile</filename> files. In case you have already
+        created some <literal>InnoDB</literal> tables, delete the
+        corresponding <filename>.frm</filename> files for these tables
+        (and any <filename>.ibd</filename> files if you are using
+        multiple tablespaces) from the MySQL database directories as
+        well. Then you can try the <literal>InnoDB</literal> database
+        creation again. It is best to start the MySQL server from a
+        command prompt so that you see what is happening.
+      </para>
+
+    </section>
+
+  </section>
+
+  <section id="using-innodb-tables">
+
+    <title>Creating and Using <literal>InnoDB</literal>
Tables</title>
+
+    <para>
+      To create an <literal>InnoDB</literal> table, specify an
+      <literal>ENGINE = InnoDB</literal> option in the <literal>CREATE
+      TABLE</literal> statement:
+    </para>
+
+<programlisting>
+CREATE TABLE customers (a INT, b CHAR (20), INDEX (a)) ENGINE=InnoDB;
+</programlisting>
+
+    <para>
+      The older term <literal>TYPE</literal> is supported as a synonym
+      for <literal>ENGINE</literal> for backward compatibility, but
+      <literal>ENGINE</literal> is the preferred term and
+      <literal>TYPE</literal> is deprecated.
+    </para>
+
+    <para>
+      The statement creates a table and an index on column
+      <literal>a</literal> in the <literal>InnoDB</literal>
tablespace
+      that consists of the data files that you specified in
+      <filename>my.cnf</filename>. In addition, MySQL creates a file
+      <filename>customers.frm</filename> in the
+      <filename>test</filename> directory under the MySQL database
+      directory. Internally, <literal>InnoDB</literal> adds an entry for
+      the table to its own data dictionary. The entry includes the
+      database name. For example, if <literal>test</literal> is the
+      database in which the <literal>customers</literal> table is
+      created, the entry is for <literal>'test/customers'</literal>.
+      This means you can create a table of the same name
+      <literal>customers</literal> in some other database, and the table
+      names do not collide inside <literal>InnoDB</literal>.
+    </para>
+
+    <para>
+      You can query the amount of free space in the
+      <literal>InnoDB</literal> tablespace by issuing a <literal>SHOW
+      TABLE STATUS</literal> statement for any
<literal>InnoDB</literal>
+      table. The amount of free space in the tablespace appears in the
+      <literal>Comment</literal> section in the output of <literal>SHOW
+      TABLE STATUS</literal>. For example:
+    </para>
+
+<programlisting>
+SHOW TABLE STATUS FROM test LIKE 'customers'
+</programlisting>
+
+    <para>
+      Note that the statistics <literal>SHOW</literal> displays for
+      <literal>InnoDB</literal> tables are only approximate. They are
+      used in SQL optimization. Table and index reserved sizes in bytes
+      are accurate, though.
+    </para>
+
+    <section id="innodb-transactions-with-different-apis">
+
+      <title>How to Use Transactions in <literal>InnoDB</literal> with
Different APIs</title>
+
+      <para>
+        By default, each client that connects to the MySQL server begins
+        with autocommit mode enabled, which automatically commits every
+        SQL statement as you execute it. To use multiple-statement
+        transactions, you can switch autocommit off with the SQL
+        statement <literal>SET AUTOCOMMIT = 0</literal> and use
+        <literal>COMMIT</literal> and <literal>ROLLBACK</literal>
to
+        commit or roll back your transaction. If you want to leave
+        autocommit on, you can enclose your transactions within
+        <literal>START TRANSACTION</literal> and either
+        <literal>COMMIT</literal> or <literal>ROLLBACK</literal>.
Before
+        MySQL 4.0.11, you have to use the keyword
+        <literal>BEGIN</literal> instead of <literal>START
+        TRANSACTION</literal>. The following example shows two
+        transactions. The first is committed and the second is rolled
+        back.
+      </para>
+
+<programlisting>
+shell&gt; <userinput>mysql test</userinput>
+
+mysql&gt; <userinput>CREATE TABLE CUSTOMER (A INT, B CHAR (20), INDEX
(A))</userinput>
+    -&gt; <userinput>TYPE=InnoDB;</userinput>
+Query OK, 0 rows affected (0.00 sec)
+mysql&gt; <userinput>BEGIN;</userinput>
+Query OK, 0 rows affected (0.00 sec)
+mysql&gt; <userinput>INSERT INTO CUSTOMER VALUES (10,
'Heikki');</userinput>
+Query OK, 1 row affected (0.00 sec)
+mysql&gt; <userinput>COMMIT;</userinput>
+Query OK, 0 rows affected (0.00 sec)
+mysql&gt; <userinput>SET AUTOCOMMIT=0;</userinput>
+Query OK, 0 rows affected (0.00 sec)
+mysql&gt; <userinput>INSERT INTO CUSTOMER VALUES (15,
'John');</userinput>
+Query OK, 1 row affected (0.00 sec)
+mysql&gt; <userinput>ROLLBACK;</userinput>
+Query OK, 0 rows affected (0.00 sec)
+mysql&gt; <userinput>SELECT * FROM CUSTOMER;</userinput>
++------+--------+
+| A    | B      |
++------+--------+
+|   10 | Heikki |
++------+--------+
+1 row in set (0.00 sec)
+mysql&gt;
+</programlisting>
+
+      <para>
+        In APIs such as PHP, Perl DBI, JDBC, ODBC, or the standard C
+        call interface of MySQL, you can send transaction control
+        statements such as <literal>COMMIT</literal> to the MySQL server
+        as strings just like any other SQL statements such as
+        <literal>SELECT</literal> or <literal>INSERT</literal>.
Some
+        APIs also offer separate special transaction commit and rollback
+        functions or methods.
+      </para>
+
+    </section>
+
+    <section id="converting-tables-to-innodb">
+
+      <title>Converting <literal>MyISAM</literal> Tables to
<literal>InnoDB</literal></title>
+
+      <para>
+        Important: Do not convert MySQL system tables in the
+        <literal>mysql</literal> database (such as
+        <literal>user</literal> or <literal>host</literal>) to
the
+        <literal>InnoDB</literal> type. This is an unsupported
+        operation. The system tables must always be of the
+        <literal>MyISAM</literal> type.
+      </para>
+
+      <para>
+        If you want all your (non-system) tables to be created as
+        <literal>InnoDB</literal> tables, you can, starting from the
+        MySQL 3.23.43, add the line
+        <literal>default-table-type=innodb</literal> to the
+        <literal>[mysqld]</literal> section of your server option file.
+      </para>
+
+      <para>
+        <literal>InnoDB</literal> does not have a special optimization
+        for separate index creation the way the
+        <literal>MyISAM</literal> storage engine does. Therefore, it
+        does not pay to export and import the table and create indexes
+        afterward. The fastest way to alter a table to
+        <literal>InnoDB</literal> is to do the inserts directly to an
+        <literal>InnoDB</literal> table. That is, use <literal>ALTER
+        TABLE ... TYPE=INNODB</literal>, or create an empty
+        <literal>InnoDB</literal> table with identical definitions and
+        insert the rows with <literal>INSERT INTO ... SELECT * FROM
+        ...</literal>.
+      </para>
+
+      <para>
+        If you have <literal>UNIQUE</literal> constraints on secondary
+        keys, starting from MySQL 3.23.52, you can speed up a table
+        import by turning off the uniqueness checks temporarily during
+        the import operation:
+      </para>
+
+<programlisting>
+SET UNIQUE_CHECKS=0;
+<replaceable>... import operation ...</replaceable>
+SET UNIQUE_CHECKS=1;
+</programlisting>
+
+      <para>
+        For big tables, this saves a lot of disk I/O because
+        <literal>InnoDB</literal> can then use its insert buffer to
+        write secondary index records in a batch. Be certain that the
+        data contains no duplicate keys.
+        <literal>UNIQUE_CHECKS</literal> allows but does not require
+        storage engines to ignore duplicate keys.
+      </para>
+
+      <para>
+        To get better control over the insertion process, it might be
+        good to insert big tables in pieces:
+      </para>
+
+<programlisting>
+INSERT INTO newtable SELECT * FROM oldtable
+   WHERE yourkey &gt; something AND yourkey &lt;= somethingelse;
+</programlisting>
+
+      <para>
+        After all records have been inserted, you can rename the tables.
+      </para>
+
+      <para>
+        During the conversion of big tables, you should increase the
+        size of the <literal>InnoDB</literal> buffer pool to reduce disk
+        I/O. Do not use more than 80% of the physical memory, though.
+        You can also increase the sizes of the <literal>InnoDB</literal>
+        log files.
+      </para>
+
+      <para>
+        Make sure that you do not fill up the tablespace:
+        <literal>InnoDB</literal> tables require a lot more disk space
+        than <literal>MyISAM</literal> tables. If an <literal>ALTER
+        TABLE</literal> operation runs out of space, it starts a
+        rollback, and that can take hours if it is disk-bound. For
+        inserts, <literal>InnoDB</literal> uses the insert buffer to
+        merge secondary index records to indexes in batches. That saves
+        a lot of disk I/O. For rollback, no such mechanism is used, and
+        the rollback can take 30 times longer than the insertion.
+      </para>
+
+      <para>
+        In the case of a runaway rollback, if you do not have valuable
+        data in your database, it may be advisable to kill the database
+        process rather than wait for millions of disk I/O operations to
+        complete. For the complete procedure, see
+        <xref linkend="forcing-recovery"/>.
+      </para>
+
+    </section>
+
+    <section id="innodb-auto-increment-handling">
+
+      <title>How <literal>AUTO_INCREMENT</literal> Handling Works in
+        <literal>InnoDB</literal></title>
+
+      <para>
+        If you specify an <literal>AUTO_INCREMENT</literal> column for
+        an <literal>InnoDB</literal> table, the table handle in the
+        <literal>InnoDB</literal> data dictionary contains a special
+        counter called the auto-increment counter that is used in
+        assigning new values for the column. This counter is stored only
+        in main memory, not on disk.
+      </para>
+
+      <para>
+        <literal>InnoDB</literal> uses the following algorithm to
+        initialize the auto-increment counter for a table
+        <literal>t</literal> that contains an
+        <literal>AUTO_INCREMENT</literal> column named
+        <literal>ai_col</literal>: After a server startup, for the first
+        insert into a table <literal>t</literal>,
+        <literal>InnoDB</literal> executes the equivalent of this
+        statement:
+      </para>
+
+<programlisting>
+SELECT MAX(ai_col) FROM t FOR UPDATE;
+</programlisting>
+
+      <para>
+        <literal>InnoDB</literal> increments by one the value retrieved
+        by the statement and assigns it to the column and to the
+        auto-increment counter for the table. If the table is empty,
+        <literal>InnoDB</literal> uses the value
<literal>1</literal>.
+        If a user invokes a <literal>SHOW TABLE STATUS</literal>
+        statement that displays output for the table
+        <literal>t</literal> and the auto-increment counter has not been
+        initialized, <literal>InnoDB</literal> initializes but does not
+        increment the value and stores it for use by later inserts. This
+        initialization uses a normal exclusive-locking read on the table
+        and the lock lasts to the end of the transaction.
+      </para>
+
+      <para>
+        <literal>InnoDB</literal> follows the same procedure for
+        initializing the auto-increment counter for a freshly created
+        table.
+      </para>
+
+      <para>
+        After the auto-increment counter has been initialized, if a user
+        does not explicitly specify a value for an
+        <literal>AUTO_INCREMENT</literal> column,
+        <literal>InnoDB</literal> increments the counter by one and
+        assigns the new value to the column. If the user inserts a row
+        that explicitly specifies the column value, and the value is
+        bigger than the current counter value, the counter is set to the
+        specified column value.
+      </para>
+
+      <para>
+        When accessing the auto-increment counter,
+        <literal>InnoDB</literal> uses a special table-level
+        <literal>AUTO-INC</literal> lock that it keeps to the end of the
+        current SQL statement, not to the end of the transaction. The
+        special lock release strategy was introduced to improve
+        concurrency for inserts into a table containing an
+        <literal>AUTO_INCREMENT</literal> column. Nevertheless, two
+        transactions cannot have the <literal>AUTO-INC</literal> lock on
+        the same table simultaneously, which can have a performance
+        impact if the <literal>AUTO-INC</literal> lock is held for a
+        long time. That might be the case for a statement such as
+        <literal>INSERT INTO t1 ... SELECT ... FROM t2</literal> that
+        inserts all rows from one table into another.
+      </para>
+
+      <para>
+        <literal>InnoDB</literal> uses the in-memory auto-increment
+        counter as long as the server runs. When the server is stopped
+        and restarted, <literal>InnoDB</literal> reinitializes the
+        counter for each table for the first <literal>INSERT</literal>
+        to the table, as described earlier.
+      </para>
+
+      <para>
+        You may see gaps in the sequence of values assigned to the
+        <literal>AUTO_INCREMENT</literal> column if you roll back
+        transactions that have generated numbers using the counter.
+      </para>
+
+      <para>
+        If a user specifies <literal>NULL</literal> or
+        <literal>0</literal> for the
<literal>AUTO_INCREMENT</literal>
+        column in an <literal>INSERT</literal>,
+        <literal>InnoDB</literal> treats the row as if the value had not
+        been specified and generates a new value for it.
+      </para>
+
+      <para>
+        The behavior of the auto-increment mechanism is not defined if a
+        user assigns a negative value to the column or if the value
+        becomes bigger than the maximum integer that can be stored in
+        the specified integer type.
+      </para>
+
+      <para>
+        An <literal>AUTO_INCREMENT</literal> column must be the first
+        column listed if it is part of a multiple-column index in an
+        <literal>InnoDB</literal> table.
+      </para>
+
+      <para>
+        Beginning with MySQL 4.1.12, <literal>InnoDB</literal> supports
+        the <literal>AUTO_INCREMENT =
+        <replaceable>n</replaceable></literal> table option in
+        <literal>ALTER TABLE</literal> statements, to set the initial
+        counter value or alter the current counter value. The same is
+        true as of MySQL 4.1.14 for <literal>CREATE TABLE</literal>. The
+        effect of this option is canceled by a server restart, for
+        reasons discussed earlier in this section.
+      </para>
+
+    </section>
+
+    <section id="innodb-foreign-key-constraints">
+
+      <title><literal>FOREIGN KEY</literal> Constraints</title>
+
+      <remark role="help-category" condition="Data Definition"/>
+
+      <para>
+        Starting from MySQL 3.23.44, <literal>InnoDB</literal> features
+        foreign key constraints.
+      </para>
+
+      <remark role="help-topic" condition="CONSTRAINT"/>
+
+      <remark role="help-keywords">
+        FOREIGN KEY REFERENCES NO ACTION ON DELETE CASCADE SET NULL
+        RESTRICT UPDATE
+      </remark>
+
+      <remark role="help-description-begin"/>
+
+      <para>
+        The syntax of a foreign key constraint definition in
+        <literal>InnoDB</literal> looks like this:
+      </para>
+
+<programlisting>
+[CONSTRAINT <replaceable>symbol</replaceable>] FOREIGN KEY
[<replaceable>id</replaceable>]
(<replaceable>index_col_name</replaceable>, ...)
+    REFERENCES <replaceable>tbl_name</replaceable>
(<replaceable>index_col_name</replaceable>, ...)
+    [ON DELETE {RESTRICT | CASCADE | SET NULL | NO ACTION}]
+    [ON UPDATE {RESTRICT | CASCADE | SET NULL | NO ACTION}]
+</programlisting>
+
+      <remark role="help-description-end"/>
+
+      <para>
+        Foreign keys definitions are subject to the following
+        conditions:
+      </para>
+
+      <itemizedlist>
+
+        <listitem>
+          <para>
+            Both tables must be <literal>InnoDB</literal> tables and
+            they must not be <literal>TEMPORARY</literal> tables.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Corresponding columns in the foreign key and the referenced
+            key must have similar internal data types inside
+            <literal>InnoDB</literal> so that they can be compared
+            without a type conversion. <emphasis>The size and sign of
+            integer types must be the same</emphasis>. The length of
+            string types need not be the same. For non-binary
+            (character) string columns, the character set and collation
+            must be the same.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            In the referencing table, there must be an index where the
+            foreign key columns are listed as the
+            <emphasis>first</emphasis> columns in the same order.
+            Starting with MySQL 4.1.2, such an index is created on the
+            referencing table automatically if it does not exist.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            In the referenced table, there must be an index where the
+            referenced columns are listed as the
+            <emphasis>first</emphasis> columns in the same order.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Index prefixes on foreign key columns are not supported. One
+            consequence of this is that <literal>BLOB</literal> and
+            <literal>TEXT</literal> columns cannot be included in a
+            foreign key, because indexes on those columns must always
+            include a prefix length.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            If the <literal>CONSTRAINT
+            <replaceable>symbol</replaceable></literal> clause is
given,
+            the <replaceable>symbol</replaceable> value must be unique
+            in the database. If the clause is not given,
+            <literal>InnoDB</literal> creates the name automatically.
+          </para>
+        </listitem>
+
+      </itemizedlist>
+
+      <para>
+        <literal>InnoDB</literal> rejects any
<literal>INSERT</literal>
+        or <literal>UPDATE</literal> operation that attempts to create a
+        foreign key value in a child table if there is no a matching
+        candidate key value in the parent table. The action
+        <literal>InnoDB</literal> takes for any
+        <literal>UPDATE</literal> or <literal>DELETE</literal>
operation
+        that attempts to update or delete a candidate key value in the
+        parent table that has some matching rows in the child table is
+        dependent on the <emphasis>referential action</emphasis>
+        specified using <literal>ON UPDATE</literal> and <literal>ON
+        DELETE</literal> subclauses of the <literal>FOREIGN
+        KEY</literal> clause. When the user attempts to delete or update
+        a row from a parent table, and there are one or more matching
+        rows in the child table, <literal>InnoDB</literal> supports five
+        options regarding the action to be taken:
+      </para>
+
+      <itemizedlist>
+
+        <remark role="todo">
+          Rewrite following paragraph to make clearer.
+        </remark>
+
+        <listitem>
+          <para>
+            <literal>CASCADE</literal>: Delete or update the row from
+            the parent table and automatically delete or update the
+            matching rows in the child table. <literal>ON DELETE
+            CASCADE</literal> is supported starting from MySQL 3.23.50
+            and <literal>ON UPDATE CASCADE</literal> is supported
+            starting from 4.0.8. Between two tables, you should not
+            define several <literal>ON UPDATE CASCADE</literal> clauses
+            that act on the same column in the parent table or in the
+            child table.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>SET NULL</literal>: Delete or update the row from
+            the parent table and set the foreign key column or columns
+            in the child table to <literal>NULL</literal>. This is valid
+            only if the foreign key columns do not have the <literal>NOT
+            NULL</literal> qualifier specified. <literal>ON DELETE SET
+            NULL</literal> is available starting from MySQL 3.23.50 and
+            <literal>ON UPDATE SET NULL</literal> is available starting
+            from 4.0.8.
+          </para>
+
+          <para>
+            If you specify a <literal>SET NULL</literal> action,
+            <emphasis>make sure that you have not declared the columns
+            in the child table as <literal>NOT
+            NULL</literal></emphasis>.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>NO ACTION</literal>: In standard sQL, <literal>NO
+            ACTION</literal> means <emphasis>no action</emphasis> in
the
+            sense that an attempt to delete or update a primary key
+            value will not be allowed to proceed if there is a related
+            foreign key value in the referenced table. Starting from
+            4.0.18 <literal>InnoDB</literal> rejects the delete or
+            update operation for the parent table.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>RESTRICT</literal>: Rejects the delete or update
+            operation for the parent table. <literal>NO ACTION</literal>
+            and <literal>RESTRICT</literal> are the same as omitting the
+            <literal>ON DELETE</literal> or <literal>ON
UPDATE</literal>
+            clause. (Some database systems have deferred checks, and
+            <literal>NO ACTION</literal> is a deferred check. In MySQL,
+            foreign key constraints are checked immediately, so
+            <literal>NO ACTION</literal> and
<literal>RESTRICT</literal>
+            are the same.)
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>SET DEFAULT</literal>: This action is recognized by
+            the parser, but <literal>InnoDB</literal> rejects table
+            definitions containing <literal>ON DELETE SET
+            DEFAULT</literal> or <literal>ON UPDATE SET
+            DEFAULT</literal> clauses.
+          </para>
+        </listitem>
+
+      </itemizedlist>
+
+      <para>
+        Note that <literal>InnoDB</literal> supports foreign key
+        references within a table. In these cases, <quote>child table
+        records</quote> really refers to dependent records within the
+        same table.
+      </para>
+
+      <para>
+        <literal>InnoDB</literal> requires indexes on foreign keys and
+        referenced keys so that foreign key checks can be fast and not
+        require a table scan. Starting with MySQL 4.1.2, the index on
+        the foreign key is created automatically. In older versions, the
+        indexes must be created explicitly or the creation of foreign
+        key constraints fails.
+      </para>
+
+      <para>
+        If MySQL reports an error number 1005 from a <literal>CREATE
+        TABLE</literal> statement, and the error message refers to errno
+        150, table creation failed because a foreign key constraint was
+        not correctly formed. Similarly, if an <literal>ALTER
+        TABLE</literal> fails and it refers to errno 150, that means a
+        foreign key definition would be incorrectly formed for the
+        altered table. Starting from MySQL 4.0.13, you can use
+        <literal>SHOW INNODB STATUS</literal> to display a detailed
+        explanation of the latest <literal>InnoDB</literal> foreign key
+        error in the server.
+      </para>
+
+      <para>
+        Starting from MySQL 3.23.50, <literal>InnoDB</literal> does not
+        check foreign key constraints on those foreign key or referenced
+        key values that contain a <literal>NULL</literal> column.
+      </para>
+
+      <note>
+        <para>
+          Currently, triggers are not activated by cascaded foreign key
+          actions.
+        </para>
+      </note>
+
+      <para>
+        You cannot create a table with a column name that matches the
+        name of an internal InnoDB column (including
+        <literal>DB_ROW_ID</literal>,
<literal>DB_TRX_ID</literal>,
+        <literal>DB_ROLL_PTR</literal> and
+        <literal>DB_MIX_ID</literal>). In versions of MySQL before
+        4.1.19 this would cause a crash, since 4.1.19 the server will
+        report error 1005 and refers to <literal>errno</literal> -1 in
+        the error message.
+      </para>
+
+      <para>
+        <emphasis role="bold">Deviation from SQL standards</emphasis>:
+        If there are several rows in the parent table that have the same
+        referenced key value, <literal>InnoDB</literal> acts in foreign
+        key checks as if the other parent rows with the same key value
+        do not exist. For example, if you have defined a
+        <literal>RESTRICT</literal> type constraint, and there is a
+        child row with several parent rows, <literal>InnoDB</literal>
+        does not allow the deletion of any of those parent rows.
+      </para>
+
+      <para>
+        <literal>InnoDB</literal> performs cascading operations through
+        a depth-first algorithm, based on records in the indexes
+        corresponding to the foreign key constraints.
+      </para>
+
+      <para>
+        <emphasis role="bold">Deviation from SQL standards</emphasis>: A
+        <literal>FOREIGN KEY</literal> constraint that references a
+        non-<literal>UNIQUE</literal> key is not standard SQL. It is an
+        <literal>InnoDB</literal> extension to standard SQL.
+      </para>
+
+      <para>
+        <emphasis role="bold">Deviation from SQL standards</emphasis>:
+        If <literal>ON UPDATE CASCADE</literal> or <literal>ON UPDATE
+        SET NULL</literal> recurses to update the <emphasis>same
+        table</emphasis> it has previously updated during the cascade,
+        it acts like <literal>RESTRICT</literal>. This means that you
+        cannot use self-referential <literal>ON UPDATE CASCADE</literal>
+        or <literal>ON UPDATE SET NULL</literal> operations. This is to
+        prevent infinite loops resulting from cascaded updates. A
+        self-referential <literal>ON DELETE SET NULL</literal>, on the
+        other hand, is possible from 4.0.13. A self-referential
+        <literal>ON DELETE CASCADE</literal> has been possible since
+        <literal>ON DELETE</literal> was implemented. Since 4.0.21,
+        cascading operations may not be nested more than 15 levels.
+      </para>
+
+      <para>
+        <emphasis role="bold">Deviation from SQL standards</emphasis>:
+        Like MySQL in general, in an SQL statement that inserts,
+        deletes, or updates many rows, <literal>InnoDB</literal> checks
+        <literal>UNIQUE</literal> and <literal>FOREIGN
KEY</literal>
+        constraints row-by-row. According to the SQL standard, the
+        default behavior should be deferred checking. That is,
+        constraints are only checked after the
+        <emphasis role="bold">whole SQL statement</emphasis> has been
+        processed. Until <literal>InnoDB</literal> implements deferred
+        constraint checking, some things will be impossible, such as
+        deleting a record that refers to itself via a foreign key.
+      </para>
+
+      <para>
+        Here is a simple example that relates <literal>parent</literal>
+        and <literal>child</literal> tables through a single-column
+        foreign key:
+      </para>
+
+<programlisting>
+CREATE TABLE parent (id INT NOT NULL,
+                     PRIMARY KEY (id)
+) TYPE=INNODB;
+CREATE TABLE child (id INT, parent_id INT,
+                    INDEX par_ind (parent_id),
+                    FOREIGN KEY (parent_id) REFERENCES parent(id)
+                      ON DELETE CASCADE
+) TYPE=INNODB;
+</programlisting>
+
+      <para>
+        A more complex example in which a
+        <literal>product_order</literal> table has foreign keys for two
+        other tables. One foreign key references a two-column index in
+        the <literal>product</literal> table. The other references a
+        single-column index in the <literal>customer</literal> table:
+      </para>
+
+      <remark role="help-example"/>
+
+<programlisting>
+CREATE TABLE product (category INT NOT NULL, id INT NOT NULL,
+                      price DECIMAL,
+                      PRIMARY KEY(category, id)) TYPE=INNODB;
+CREATE TABLE customer (id INT NOT NULL,
+                       PRIMARY KEY (id)) TYPE=INNODB;
+CREATE TABLE product_order (no INT NOT NULL AUTO_INCREMENT,
+                            product_category INT NOT NULL,
+                            product_id INT NOT NULL,
+                            customer_id INT NOT NULL,
+                            PRIMARY KEY(no),
+                            INDEX (product_category, product_id),
+                            FOREIGN KEY (product_category, product_id)
+                              REFERENCES product(category, id)
+                              ON UPDATE CASCADE ON DELETE RESTRICT,
+                            INDEX (customer_id),
+                            FOREIGN KEY (customer_id)
+                              REFERENCES customer(id)) TYPE=INNODB;
+</programlisting>
+
+      <para>
+        Starting from MySQL 3.23.50, <literal>InnoDB</literal> allows
+        you to add a new foreign key constraint to a table by using
+        <literal>ALTER TABLE</literal>:
+      </para>
+
+<programlisting>
+ALTER TABLE <replaceable>tbl_name</replaceable>
+    ADD [CONSTRAINT <replaceable>symbol</replaceable>] FOREIGN KEY
[<replaceable>id</replaceable>]
(<replaceable>index_col_name</replaceable>, ...)
+    REFERENCES <replaceable>tbl_name</replaceable>
(<replaceable>index_col_name</replaceable>, ...)
+    [ON DELETE {RESTRICT | CASCADE | SET NULL | NO ACTION}]
+    [ON UPDATE {RESTRICT | CASCADE | SET NULL | NO ACTION}]
+</programlisting>
+
+      <para>
+        <emphasis role="bold">Remember to create the required indexes
+        first</emphasis>. You can also add a self-referential foreign
+        key constraint to a table using <literal>ALTER TABLE</literal>.
+      </para>
+
+      <indexterm>
+        <primary>DROP FOREIGN KEY</primary>
+      </indexterm>
+
+      <indexterm>
+        <primary>deleting</primary>
+        <secondary>foreign key</secondary>
+      </indexterm>
+
+      <indexterm>
+        <primary>foreign key</primary>
+        <secondary>deleting</secondary>
+      </indexterm>
+
+      <para>
+        Starting from MySQL 4.0.13, <literal>InnoDB</literal> supports
+        the use of <literal>ALTER TABLE</literal> to drop foreign keys:
+      </para>
+
+<programlisting>
+ALTER TABLE <replaceable>tbl_name</replaceable> DROP FOREIGN KEY
<replaceable>fk_symbol</replaceable>;
+</programlisting>
+
+      <para>
+        If the <literal>FOREIGN KEY</literal> clause included a
+        <literal>CONSTRAINT</literal> name when you created the foreign
+        key, you can refer to that name to drop the foreign key. (A
+        constraint name can be given as of MySQL 4.0.18.) Otherwise, the
+        <replaceable>fk_symbol</replaceable> value is internally
+        generated by <literal>InnoDB</literal> when the foreign key is
+        created. To find out the symbol when you want to drop a foreign
+        key, use the <literal>SHOW CREATE TABLE</literal> statement. For
+        example:
+      </para>
+
+<programlisting>
+mysql&gt; <userinput>SHOW CREATE TABLE ibtest11c\G</userinput>
+*************************** 1. row ***************************
+       Table: ibtest11c
+Create Table: CREATE TABLE `ibtest11c` (
+  `A` int(11) NOT NULL auto_increment,
+  `D` int(11) NOT NULL default '0',
+  `B` varchar(200) NOT NULL default '',
+  `C` varchar(175) default NULL,
+  PRIMARY KEY  (`A`,`D`,`B`),
+  KEY `B` (`B`,`C`),
+  KEY `C` (`C`),
+  CONSTRAINT `0_38775` FOREIGN KEY (`A`, `D`)
+REFERENCES `ibtest11a` (`A`, `D`)
+ON DELETE CASCADE ON UPDATE CASCADE,
+  CONSTRAINT `0_38776` FOREIGN KEY (`B`, `C`)
+REFERENCES `ibtest11a` (`B`, `C`)
+ON DELETE CASCADE ON UPDATE CASCADE
+) TYPE=InnoDB CHARSET=latin1
+1 row in set (0.01 sec)
+
+mysql&gt; <userinput>ALTER TABLE ibtest11c DROP FOREIGN KEY
`0_38775`;</userinput>
+</programlisting>
+
+      <para>
+        You cannot add a foreign key and drop a foreign key in separate
+        clauses of a single <literal>ALTER TABLE</literal> statement.
+        Separate statements are required.
+      </para>
+
+      <para>
+        If <literal>ALTER TABLE</literal> for an
+        <literal>InnoDB</literal> table results in changes to column
+        values (for example, because a column is truncated),
+        <literal>InnoDB</literal>'s <literal>FOREIGN
KEY</literal>
+        constraint checks do not notice possible violations caused by
+        changing the values.
+      </para>
+
+      <para>
+        Starting from MySQL 3.23.50, the <literal>InnoDB</literal>
+        parser allows table and column identifiers in a <literal>FOREIGN
+        KEY ... REFERENCES ...</literal> clause to be quoted within
+        backticks. (Alternatively, double quotes can be used if the
+        <literal>ANSI_QUOTES</literal> SQL mode is enabled.) The
+        <literal>InnoDB</literal> parser also takes into account the
+        setting of the <literal>lower_case_table_names</literal> system
+        variable.
+      </para>
+
+      <para>
+        Before MySQL 3.23.50, <literal>ALTER TABLE</literal> or
+        <literal>CREATE INDEX</literal> should not be used in connection
+        with tables that have foreign key constraints or that are
+        referenced in foreign key constraints: Any <literal>ALTER
+        TABLE</literal> removes all foreign key constraints defined for
+        the table. You should not use <literal>ALTER TABLE</literal>
+        with the referenced table, either. Instead, use <literal>DROP
+        TABLE</literal> and <literal>CREATE TABLE</literal> to modify
+        the schema. When MySQL does an <literal>ALTER TABLE</literal> it
+        may internally use <literal>RENAME TABLE</literal>, and that
+        confuses the foreign key constraints that refer to the table. In
+        MySQL, a <literal>CREATE INDEX</literal> statement is processed
+        as an <literal>ALTER TABLE</literal>, so the same considerations
+        apply.
+      </para>
+
+      <para>
+        Starting from MySQL 3.23.50, <literal>InnoDB</literal> returns
+        the foreign key definitions of a table as part of the output of
+        the <literal>SHOW CREATE TABLE</literal> statement:
+      </para>
+
+<programlisting>
+SHOW CREATE TABLE <replaceable>tbl_name</replaceable>;
+</programlisting>
+
+      <para>
+        <command>mysqldump</command> also produces correct definitions
+        of tables to the dump file, and does not forget about the
+        foreign keys.
+      </para>
+
+      <para>
+        You can also display the foreign key constraints for a table
+        like this:
+      </para>
+
+<programlisting>
+SHOW TABLE STATUS FROM <replaceable>db_name</replaceable> LIKE
'<replaceable>tbl_name</replaceable>';
+</programlisting>
+
+      <para>
+        The foreign key constraints are listed in the
+        <literal>Comment</literal> column of the output.
+      </para>
+
+      <para>
+        When performing foreign key checks, <literal>InnoDB</literal>
+        sets shared row-level locks on child or parent records it has to
+        look at. <literal>InnoDB</literal> checks foreign key
+        constraints immediately; the check is not deferred to
+        transaction commit.
+      </para>
+
+      <para>
+        To make it easier to reload dump files for tables that have
+        foreign key relationships, <command>mysqldump</command>
+        automatically includes a statement in the dump output to set
+        <literal>FOREIGN_KEY_CHECKS</literal> to 0 as of MySQL 4.1.1.
+        This avoids problems with tables having to be reloaded in a
+        particular order when the dump is reloaded. For earlier
+        versions, you can disable the variable manually within
+        <command>mysql</command> when loading the dump file like this:
+      </para>
+
+<programlisting>
+mysql&gt; <userinput>SET FOREIGN_KEY_CHECKS = 0;</userinput>
+mysql&gt; <userinput>SOURCE
<replaceable>dump_file_name</replaceable>;</userinput>
+mysql&gt; <userinput>SET FOREIGN_KEY_CHECKS = 1;</userinput>
+</programlisting>
+
+      <para>
+        This allows you to import the tables in any order if the dump
+        file contains tables that are not correctly ordered for foreign
+        keys. It also speeds up the import operation. Setting
+        <literal>FOREIGN_KEY_CHECKS</literal> to 0 can also be useful
+        for ignoring foreign key constraints during <literal>LOAD
+        DATA</literal> or <literal>ALTER TABLE</literal> operations.
+        However, even if <literal>FOREIGN_KEY_CHECKS=0</literal>, InnoDB
+        does not allow the creation of a foreign key constraint where a
+        column references a non-matching column type. Also, if an
+        <literal>InnoDB</literal> table has foreign key constraints,
+        <literal>ALTER TABLE</literal> cannot be used to change the
+        table to use another storage engine. To alter the storage
+        engine, you must drop any foreign key constraints first.
+        <literal>FOREIGN_KEY_CHECKS</literal> is available starting from
+        MySQL 3.23.52 and 4.0.3.
+      </para>
+
+      <para>
+        <literal>InnoDB</literal> does not allow you to drop a table
+        that is referenced by a <literal>FOREIGN KEY</literal>
+        constraint, unless you do <literal>SET
+        FOREIGN_KEY_CHECKS=0</literal>. When you drop a table, the
+        constraints that were defined in its create statement are also
+        dropped.
+      </para>
+
+      <para>
+        If you re-create a table that was dropped, it must have a
+        definition that conforms to the foreign key constraints
+        referencing it. It must have the right column names and types,
+        and it must have indexes on the referenced keys, as stated
+        earlier. If these are not satisfied, MySQL returns error number
+        1005 and refers to errno 150 in the error message.
+      </para>
+
+    </section>
+
+    <section id="innodb-and-mysql-replication">
+
+      <title><literal>InnoDB</literal> and MySQL
Replication</title>
+
+      <para>
+        MySQL replication works for <literal>InnoDB</literal> tables as
+        it does for <literal>MyISAM</literal> tables. It is also
+        possible to use replication in a way where the storage engine on
+        the slave is not the same as the original storage engine on the
+        master. For example, you can replicate modifications to an
+        <literal>InnoDB</literal> table on the master to a
+        <literal>MyISAM</literal> table on the slave.
+      </para>
+
+      <para>
+        To set up a new slave for a master, you have to make a copy of
+        the <literal>InnoDB</literal> tablespace and the log files, as
+        well as the <filename>.frm</filename> files of the
+        <literal>InnoDB</literal> tables, and move the copies to the
+        slave. If the <literal>innodb_file_per_table</literal> variable
+        is enabled, you must also copy the <filename>.ibd</filename>
+        files as well. For the proper procedure to do this, see
+        <xref linkend="innodb-backup"/>.
+      </para>
+
+      <para>
+        If you can shut down the master or an existing slave, you can
+        take a cold backup of the <literal>InnoDB</literal> tablespace
+        and log files and use that to set up a slave. To make a new
+        slave without taking down any server you can also use the
+        non-free (commercial)
+        <ulink url="http://www.innodb.com/hot-backup"><literal>InnoDB
+        Hot Backup</literal> tool</ulink>.
+      </para>
+
+      <para>
+        There are minor limitations in <literal>InnoDB</literal>
+        replication:
+      </para>
+
+      <itemizedlist>
+
+        <listitem>
+          <para>
+            <literal>LOAD TABLE FROM MASTER</literal> does not work for
+            <literal>InnoDB</literal> type tables. There are
+            workarounds: 1) dump the table on the master and import the
+            dump file into the slave, or 2) use <literal>ALTER TABLE
+            <replaceable>tbl_name</replaceable> TYPE=MyISAM</literal>
on
+            the master before setting up replication with <literal>LOAD
+            TABLE <replaceable>tbl_name</replaceable> FROM
+            MASTER</literal>, and then use <literal>ALTER
+            TABLE</literal> to alter the master table back to the
+            <literal>InnoDB</literal> type afterward. However, this
+            should not be done for tables that have foreign key
+            definitions because the definitions will be lost.
+          </para>
+        </listitem>
+
+        <listitem>
+          <remark role="note">
+            Do not change SLAVE STOP/START to its current syntax of
+            STOP/START SLAVE in next item. 4.0.6 does not understand
+            STOP/START SLAVE.
+          </remark>
+
+          <para>
+            Before MySQL 4.0.6, <literal>SLAVE STOP</literal> did not
+            respect the boundary of a multiple-statement transaction. An
+            incomplete transaction would be rolled back, and the next
+            <literal>SLAVE START</literal> would only execute the
+            remaining part of the half transaction. That would cause
+            replication to fail.
+          </para>
+        </listitem>
+
+        <listitem>
+          <remark role="note">
+            Do not change SLAVE STOP/START to its current syntax of
+            STOP/START SLAVE in next item. 4.0.6 does not understand
+            STOP/START SLAVE.
+          </remark>
+
+          <para>
+            Before MySQL 4.0.6, a slave crash in the middle of a
+            multiple-statement transaction would cause the same problem
+            as <literal>SLAVE STOP</literal>.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Before MySQL 4.0.11, replication of the <literal>SET
+            FOREIGN_KEY_CHECKS=0</literal> statement does not work
+            properly.
+          </para>
+        </listitem>
+
+      </itemizedlist>
+
+      <para>
+        Most of these limitations can be eliminated by using more recent
+        server versions for which the limitations do not apply.
+      </para>
+
+      <para>
+        Transactions that fail on the master do not affect replication
+        at all. MySQL replication is based on the binary log where MySQL
+        writes SQL statements that modify data. A transaction that fails
+        (for example, because of a foreign key violation, or because it
+        is rolled back) is not written to the binary log, so it is not
+        sent to slaves. See <xref linkend="commit"/>.
+      </para>
+
+      <important>
+        <para>
+          Cascading actions for <literal>InnoDB</literal> tables on the
+          master are <emphasis>not</emphasis> replicated to the slave.
+          For example, suppose you have two tables defined and populated
+          on both the master and the slave as shown here:
+
+<programlisting>
+CREATE TABLE fc1 (
+    i INT PRIMARY KEY, 
+    j INT
+) ENGINE = InnoDB;
+
+CREATE TABLE fc2 (
+    m INT PRIMARY KEY, 
+    n INT, 
+    FOREIGN KEY mi (m) REFERENCES fc1 (i) 
+        ON DELETE CASCADE
+) ENGINE = InnoDB;
+
+INSERT INTO fc1 VALUES (1, 1), (2, 2);
+INSERT INTO fc2 VALUES (1, 1), (2, 2), (3, 1);
+</programlisting>
+
+          At this point, on both the master and the slave, table
+          <literal>fc1</literal> contains 2 rows, and table
+          <literal>fc2</literal> contains 3 rows. Now suppose that you
+          perform the following <literal>DELETE</literal> statement on
+          the master:
+
+<programlisting>
+DELETE FROM fc1 WHERE i = 1;
+</programlisting>
+
+          Due to the cascade, table <literal>fc2</literal> on the master
+          now contains only 1 row:
+
+<programlisting>
+mysql&gt; <userinput>SELECT * FROM fc2;</userinput>
++---+---+
+| m | n |
++---+---+
+| 2 | 2 |
++---+---+
+1 row in set (0.00 sec)
+</programlisting>
+
+          However, the cascade does not propagate to the slave. The
+          slave&apos;s copy of <literal>fc2</literal> still contains all
+          of the rows that were originally inserted:
+
+<programlisting>
+mysql&gt; <userinput>SELECT * FROM fc2;</userinput>
++---+---+
+| m | n |
++---+---+
+| 1 | 1 |
+| 3 | 1 |
+| 2 | 2 |
++---+---+
+3 rows in set (0.00 sec)
+</programlisting>
+        </para>
+      </important>
+
+    </section>
+
+  </section>
+
+  <section id="adding-and-removing">
+
+    <title>Adding and Removing <literal>InnoDB</literal> Data and Log
Files</title>
+
+    <para>
+      This section describes what you can do when your
+      <literal>InnoDB</literal> tablespace runs out of room or when you
+      want to change the size of the log files.
+    </para>
+
+    <para>
+      From MySQL 3.23.50 and 4.0.2, the easiest way to increase the size
+      of the <literal>InnoDB</literal> tablespace is to configure it
+      from the beginning to be auto-extending. Specify the
+      <literal>autoextend</literal> attribute for the last data file in
+      the tablespace definition. Then <literal>InnoDB</literal>
+      increases the size of that file automatically in 8MB increments
+      when it runs out of space. Starting with MySQL 4.0.24 and 4.1.5,
+      the increment size can be changed by setting the value of the
+      <literal>innodb_autoextend_increment</literal> system variable,
+      which is measured in MB.
+    </para>
+
+    <para>
+      Alternatively, you can increase the size of your tablespace by
+      adding another data file. To do this, you have to shut down the
+      MySQL server, change the tablespace configuration to add a new
+      data file to the end of <literal>innodb_data_file_path</literal>,
+      and start the server again.
+    </para>
+
+    <para>
+      If your last data file was defined with the keyword
+      <literal>autoextend</literal>, the procedure for reconfiguring the
+      tablespace must take into account the size to which the last data
+      file has grown. Obtain the size of the data file, round it down to
+      the closest multiple of 1024 &times; 1024 bytes (= 1MB), and
+      specify the rounded size explicitly in
+      <literal>innodb_data_file_path</literal>. Then you can add another
+      data file. Remember that only the last data file in the
+      <literal>innodb_data_file_path</literal> can be specified as
+      auto-extending.
+    </para>
+
+    <para>
+      As an example, assume that the tablespace has just one
+      auto-extending data file <filename>ibdata1</filename>:
+    </para>
+
+<programlisting>
+innodb_data_home_dir =
+innodb_data_file_path = /ibdata/ibdata1:10M:autoextend
+</programlisting>
+
+    <para>
+      Suppose that this data file, over time, has grown to 988MB. Here
+      is the configuration line after modifying the original data file
+      to not be auto-extending and adding another auto-extending data
+      file:
+    </para>
+
+<programlisting>
+innodb_data_home_dir =
+innodb_data_file_path = /ibdata/ibdata1:988M;/disk2/ibdata2:50M:autoextend
+</programlisting>
+
+    <para>
+      When you add a new file to the tablespace configuration, make sure
+      that it does not exist. <literal>InnoDB</literal> will create and
+      initialize the file when you restart the server.
+    </para>
+
+    <para>
+      Currently, you cannot remove a data file from the tablespace. To
+      decrease the size of your tablespace, use this procedure:
+    </para>
+
+    <orderedlist>
+
+      <listitem>
+        <para>
+          Use <command>mysqldump</command> to dump all your
+          <literal>InnoDB</literal> tables.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Stop the server.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Remove all the existing tablespace files.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Configure a new tablespace.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Restart the server.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Import the dump files.
+        </para>
+      </listitem>
+
+    </orderedlist>
+
+    <para>
+      If you want to change the number or the size of your
+      <literal>InnoDB</literal> log files, you must stop the MySQL
+      server and make sure that it shuts down without errors (to ensure
+      that there is no information for outstanding transactions in the
+      logs). Then copy the old log files into a safe place just in case
+      something went wrong in the shutdown and you need them to recover
+      the tablespace. Delete the old log files from the log file
+      directory, edit <filename>my.cnf</filename> to change the log file
+      configuration, and start the MySQL server again.
+      <command>mysqld</command> sees that no log files exist at startup
+      and tells you that it is creating new ones.
+    </para>
+
+  </section>
+
+  <section id="innodb-backup">
+
+    <title>Backing Up and Recovering an <literal>InnoDB</literal>
Database</title>
+
+    <para>
+      The key to safe database management is making regular backups.
+    </para>
+
+    <para>
+      <command>InnoDB Hot Backup</command> is an online backup tool you
+      can use to backup your <literal>InnoDB</literal> database while it
+      is running. <command>InnoDB Hot Backup</command> does not require
+      you to shut down your database and it does not set any locks or
+      disturb your normal database processing. <command>InnoDB Hot
+      Backup</command> is a non-free (commercial) add-on tool with an
+      annual license fee of &euro;390 per computer on which the MySQL
+      server is run. See the
+      <ulink url="http://www.innodb.com/hot-backup"><command>InnoDB Hot
+      Backup</command> home page</ulink> for detailed information and
+      screenshots.
+    </para>
+
+    <para>
+      If you are able to shut down your MySQL server, you can make a
+      binary backup that consists of all files used by
+      <literal>InnoDB</literal> to manage its tables. Use the following
+      procedure:
+    </para>
+
+    <orderedlist>
+
+      <listitem>
+        <para>
+          Shut down your MySQL server and make sure that it shuts down
+          without errors.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Copy all your data files (<filename>ibdata</filename> files
+          and <filename>.ibd</filename> files) into a safe place.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Copy all your <filename>ib_logfile</filename> files to a safe
+          place.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Copy your <filename>my.cnf</filename> configuration file or
+          files to a safe place.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Copy all the <filename>.frm</filename> files for your
+          <literal>InnoDB</literal> tables to a safe place.
+        </para>
+      </listitem>
+
+    </orderedlist>
+
+    <para>
+      Replication works with <literal>InnoDB</literal> tables, so you
+      can use MySQL replication capabilities to keep a copy of your
+      database at database sites requiring high availability.
+    </para>
+
+    <para>
+      In addition to making binary backups as just described, you should
+      also regularly make dumps of your tables with
+      <command>mysqldump</command>. The reason for this is that a binary
+      file might be corrupted without you noticing it. Dumped tables are
+      stored into text files that are human-readable, so spotting table
+      corruption becomes easier. Also, because the format is simpler,
+      the chance for serious data corruption is smaller.
+      <command>mysqldump</command> also has a
+      <option>--single-transaction</option> option that you can use to
+      make a consistent snapshot without locking out other clients.
+    </para>
+
+    <para>
+      To be able to recover your <literal>InnoDB</literal> database to
+      the present from the binary backup just described, you have to run
+      your MySQL server with binary logging turned on. Then you can
+      apply the binary log to the backup database to achieve
+      point-in-time recovery:
+    </para>
+
+<programlisting>
+mysqlbinlog <replaceable>yourhostname</replaceable>-bin.123 | mysql
+</programlisting>
+
+    <para>
+      To recover from a crash of your MySQL server, the only requirement
+      is to restart it. <literal>InnoDB</literal> automatically checks
+      the logs and performs a roll-forward of the database to the
+      present. <literal>InnoDB</literal> automatically rolls back
+      uncommitted transactions that were present at the time of the
+      crash. During recovery, <command>mysqld</command> displays output
+      something like this:
+    </para>
+
+<programlisting>
+InnoDB: Database was not shut down normally.
+InnoDB: Starting recovery from log files...
+InnoDB: Starting log scan based on checkpoint at
+InnoDB: log sequence number 0 13674004
+InnoDB: Doing recovery: scanned up to log sequence number 0 13739520
+InnoDB: Doing recovery: scanned up to log sequence number 0 13805056
+InnoDB: Doing recovery: scanned up to log sequence number 0 13870592
+InnoDB: Doing recovery: scanned up to log sequence number 0 13936128
+...
+InnoDB: Doing recovery: scanned up to log sequence number 0 20555264
+InnoDB: Doing recovery: scanned up to log sequence number 0 20620800
+InnoDB: Doing recovery: scanned up to log sequence number 0 20664692
+InnoDB: 1 uncommitted transaction(s) which must be rolled back
+InnoDB: Starting rollback of uncommitted transactions
+InnoDB: Rolling back trx no 16745
+InnoDB: Rolling back of trx no 16745 completed
+InnoDB: Rollback of uncommitted transactions completed
+InnoDB: Starting an apply batch of log records to the database...
+InnoDB: Apply batch completed
+InnoDB: Started
+mysqld: ready for connections
+</programlisting>
+
+    <para>
+      If your database gets corrupted or your disk fails, you have to do
+      the recovery from a backup. In the case of corruption, you should
+      first find a backup that is not corrupted. After restoring the
+      base backup, do the recovery from the binary log files using
+      <command>mysqlbinlog</command> and <command>mysql</command>
to
+      restore the changes performed after the backup was made.
+    </para>
+
+    <para>
+      In some cases of database corruption it is enough just to dump,
+      drop, and re-create one or a few corrupt tables. You can use the
+      <literal>CHECK TABLE</literal> SQL statement to check whether a
+      table is corrupt, although <literal>CHECK TABLE</literal>
+      naturally cannot detect every possible kind of corruption. You can
+      use <literal>innodb_tablespace_monitor</literal> to check the
+      integrity of the file space management inside the tablespace
+      files.
+    </para>
+
+    <para>
+      In some cases, apparent database page corruption is actually due
+      to the operating system corrupting its own file cache, and the
+      data on disk may be okay. It is best first to try restarting your
+      computer. Doing so may eliminate errors that appeared to be
+      database page corruption.
+    </para>
+
+    <section id="forcing-recovery">
+
+      <title>Forcing <literal>InnoDB</literal> Recovery</title>
+
+      <para>
+        If there is database page corruption, you may want to dump your
+        tables from the database with <literal>SELECT INTO
+        OUTFILE</literal>. Usually, most of the data obtained in this
+        way is intact. Even so, the corruption may cause <literal>SELECT
+        * FROM <replaceable>tbl_name</replaceable></literal> statements
+        or <literal>InnoDB</literal> background operations to crash or
+        assert, or even to cause <literal>InnoDB</literal> roll-forward
+        recovery to crash. Starting from MySQL 3.23.44, there is an
+        <literal>InnoDB</literal> variable that you can use to force the
+        <literal>InnoDB</literal> storage engine to start up, and you
+        can also prevent background operations from running, so that you
+        are able to dump your tables. For example, you can add the
+        following line to the <literal>[mysqld]</literal> section of
+        your option file before restarting the server:
+      </para>
+
+<programlisting>
+[mysqld]
+innodb_force_recovery = 4
+</programlisting>
+
+      <para>
+        Before MySQL 4.0, use this syntax instead:
+      </para>
+
+<programlisting>
+[mysqld]
+set-variable = innodb_force_recovery=4
+</programlisting>
+
+      <para>
+        The allowable non-zero values for
+        <literal>innodb_force_recovery</literal> follow. A larger number
+        includes all precautions of smaller numbers. If you are able to
+        dump your tables with an option value of at most 4, then you are
+        relatively safe that only some data on corrupt individual pages
+        is lost. A value of 6 is more drastic because database pages are
+        left in an obsolete state, which in turn may introduce more
+        corruption into B-trees and other database structures.
+      </para>
+
+      <itemizedlist>
+
+        <listitem>
+          <para>
+            <literal>1</literal>
+            (<literal>SRV_FORCE_IGNORE_CORRUPT</literal>)
+          </para>
+
+          <para>
+            Let the server run even if it detects a corrupt page. Try to
+            make <literal>SELECT * FROM
+            <replaceable>tbl_name</replaceable></literal> jump over
+            corrupt index records and pages, which helps in dumping
+            tables.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>2</literal>
+            (<literal>SRV_FORCE_NO_BACKGROUND</literal>)
+          </para>
+
+          <para>
+            Prevent the main thread from running. If a crash would occur
+            during the purge operation, this recovery value prevents it.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>3</literal>
+            (<literal>SRV_FORCE_NO_TRX_UNDO</literal>)
+          </para>
+
+          <para>
+            Do not run transaction rollbacks after recovery.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>4</literal>
+            (<literal>SRV_FORCE_NO_IBUF_MERGE</literal>)
+          </para>
+
+          <para>
+            Prevent also insert buffer merge operations. If they would
+            cause a crash, do not do them. Do not calculate table
+            statistics.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>5</literal>
+            (<literal>SRV_FORCE_NO_UNDO_LOG_SCAN</literal>)
+          </para>
+
+          <para>
+            Do not look at undo logs when starting the database:
+            <literal>InnoDB</literal> treats even incomplete
+            transactions as committed.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>6</literal>
+            (<literal>SRV_FORCE_NO_LOG_REDO</literal>)
+          </para>
+
+          <para>
+            Do not do the log roll-forward in connection with recovery.
+          </para>
+        </listitem>
+
+      </itemizedlist>
+
+      <para>
+        Starting from MySQL 3.23.53 and 4.0.4, you can
+        <literal>SELECT</literal> from tables to dump them, or
+        <literal>DROP</literal> or <literal>CREATE</literal> a
table
+        even if forced recovery is used. If you know that a certain
+        table is causing a crash in rollback, you can drop it. You can
+        use this also to stop a runaway rollback caused by a failing
+        mass import or <literal>ALTER TABLE</literal>. You can kill the
+        <command>mysqld</command> process and set
+        <literal>innodb_force_recovery</literal> to
<literal>3</literal>
+        to bring the database up without the rollback, then
+        <literal>DROP</literal> the table that is causing the runaway
+        rollback.
+      </para>
+
+      <para>
+        <emphasis>The database must not otherwise be used with any
+        non-zero value of
+        <literal>innodb_force_recovery</literal></emphasis>. As a
safety
+        measure, <literal>InnoDB</literal> prevents users from
+        performing <literal>INSERT</literal>,
<literal>UPDATE</literal>,
+        or <literal>DELETE</literal> operations when
+        <literal>innodb_force_recovery</literal> is greater than 0.
+      </para>
+
+    </section>
+
+    <section id="innodb-checkpoints">
+
+      <title>Checkpoints</title>
+
+      <para>
+        <literal>InnoDB</literal> implements a checkpoint mechanism
+        known as <quote>fuzzy</quote> checkpointing.
+        <literal>InnoDB</literal> flushes modified database pages from
+        the buffer pool in small batches. There is no need to flush the
+        buffer pool in one single batch, which would in practice stop
+        processing of user SQL statements during the checkpointing
+        process.
+      </para>
+
+      <para>
+        During crash recovery, <literal>InnoDB</literal> looks for a
+        checkpoint label written to the log files. It knows that all
+        modifications to the database before the label are present in
+        the disk image of the database. Then <literal>InnoDB</literal>
+        scans the log files forward from the checkpoint, applying the
+        logged modifications to the database.
+      </para>
+
+      <para>
+        <literal>InnoDB</literal> writes to its log files on a rotating
+        basis. All committed modifications that make the database pages
+        in the buffer pool different from the images on disk must be
+        available in the log files in case <literal>InnoDB</literal> has
+        to do a recovery. This means that when <literal>InnoDB</literal>
+        starts to reuse a log file, it has to make sure that the
+        database page images on disk contain the modifications logged in
+        the log file that <literal>InnoDB</literal> is going to reuse.
+        In other words, <literal>InnoDB</literal> must create a
+        checkpoint and this often involves flushing of modified database
+        pages to disk.
+      </para>
+
+      <para>
+        The preceding description explains why making your log files
+        very large may save disk I/O in checkpointing. It often makes
+        sense to set the total size of the log files as big as the
+        buffer pool or even bigger. The drawback of using large log
+        files is that crash recovery can take longer because there is
+        more logged information to apply to the database.
+      </para>
+
+    </section>
+
+  </section>
+
+  <section id="moving">
+
+    <title>Moving an <literal>InnoDB</literal> Database to Another
Machine</title>
+
+    <para>
+      On Windows, <literal>InnoDB</literal> always stores database and
+      table names internally in lowercase. To move databases in a binary
+      format from Unix to Windows or from Windows to Unix, you should
+      have all table and database names in lowercase. A convenient way
+      to accomplish this is to add the following line to the
+      <literal>[mysqld]</literal> section of your
+      <filename>my.cnf</filename> or <filename>my.ini</filename>
file
+      before creating any databases or tables:
+    </para>
+
+<programlisting>
+[mysqld]
+set-variable = lower_case_table_names=1
+</programlisting>
+
+    <para>
+      Like <literal>MyISAM</literal> data files,
+      <literal>InnoDB</literal> data and log files are binary-compatible
+      on all platforms having the same floating-point number format. You
+      can move an <literal>InnoDB</literal> database simply by copying
+      all the relevant files listed in <xref linkend="innodb-backup"/>.
+      If the floating-point formats differ but you have not used
+      <literal>FLOAT</literal> or <literal>DOUBLE</literal> data
types
+      in your tables, then the procedure is the same: simply copy the
+      relevant files. If the formats differ and your tables contain
+      floating-point data, you must use <command>mysqldump</command> to
+      dump your tables on one machine and then import the dump files on
+      the other machine.
+    </para>
+
+    <para>
+      One way to increase performance is to switch off autocommit mode
+      when importing data, assuming that the tablespace has enough space
+      for the big rollback segment that the import transactions
+      generate. Do the commit only after importing a whole table or a
+      segment of a table.
+    </para>
+
+  </section>
+
+  <section id="innodb-transaction-model">
+
+    <title><literal>InnoDB</literal> Transaction Model and
Locking</title>
+
+    <para>
+      In the <literal>InnoDB</literal> transaction model, the goal is to
+      combine the best properties of a multi-versioning database with
+      traditional two-phase locking. <literal>InnoDB</literal> does
+      locking on the row level and runs queries as non-locking
+      consistent reads by default, in the style of Oracle. The lock
+      table in <literal>InnoDB</literal> is stored so space-efficiently
+      that lock escalation is not needed: Typically several users are
+      allowed to lock every row in the database, or any random subset of
+      the rows, without <literal>InnoDB</literal> running out of memory.
+    </para>
+
+    <section id="innodb-lock-modes">
+
+      <title><literal>InnoDB</literal> Lock Modes</title>
+
+      <para>
+        <literal>InnoDB</literal> implements standard row-level locking
+        where there are two types of locks:
+      </para>
+
+      <itemizedlist>
+
+        <listitem>
+          <para>
+            A shared (<replaceable>S</replaceable>) lock allows a
+            transaction to read a row (tuple).
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            An exclusive (<replaceable>X</replaceable>) lock allows a
+            transaction to update or delete a row.
+          </para>
+        </listitem>
+
+      </itemizedlist>
+
+      <para>
+        If transaction <literal>T1</literal> holds a shared
+        (<replaceable>S</replaceable>) lock on tuple
+        <literal>t</literal>, then
+      </para>
+
+      <itemizedlist>
+
+        <listitem>
+          <para>
+            A request from some distinct transaction
+            <literal>T2</literal> for an
<replaceable>S</replaceable>
+            lock on <literal>t</literal> can be granted immediately. As
+            a result, both <literal>T1</literal> and
+            <literal>T2</literal> hold an
<replaceable>S</replaceable>
+            lock on <literal>t</literal>.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            A request from some distinct transaction
+            <literal>T2</literal> for an
<replaceable>X</replaceable>
+            lock on <literal>t</literal> cannot be granted immediately.
+          </para>
+        </listitem>
+
+      </itemizedlist>
+
+      <para>
+        If a transaction <literal>T1</literal> holds an exclusive
+        (<replaceable>X</replaceable>) lock on tuple
+        <literal>t</literal>, then a request from some distinct
+        transaction <literal>T2</literal> for a lock of either type on
+        <literal>t</literal> cannot be granted immediately. Instead,
+        transaction <literal>T2</literal> has to wait for transaction
+        <literal>T1</literal> to release its lock on tuple
+        <literal>t</literal>.
+      </para>
+
+      <para>
+        Additionally, <literal>InnoDB</literal> supports
+        <emphasis>multiple granularity locking</emphasis> which allows
+        coexistence of record locks and locks on entire tables. To make
+        locking at multiple granularity levels practical, additional
+        types of locks called <emphasis>intention locks</emphasis> are
+        used. Intention locks are table locks in
+        <literal>InnoDB</literal>. The idea behind intention locks is
+        for a transaction to indicate which type of lock (shared or
+        exclusive) it will require later for a row in that table. There
+        are two types of intention locks used in
+        <literal>InnoDB</literal> (assume that transaction
+        <literal>T</literal> has requested a lock of the indicated type
+        on table <literal>R</literal>):
+      </para>
+
+      <itemizedlist>
+
+        <listitem>
+          <para>
+            Intention shared (<replaceable>IS</replaceable>):
+            Transaction <literal>T</literal> intends to set
+            <replaceable>S</replaceable> locks on individual rows in
+            table <literal>R</literal>.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Intention exclusive (<replaceable>IX</replaceable>):
+            Transaction <literal>T</literal> intends to set
+            <replaceable>X</replaceable> locks on those rows.
+          </para>
+        </listitem>
+
+      </itemizedlist>
+
+      <para>
+        The intention locking protocol is as follows:
+      </para>
+
+      <itemizedlist>
+
+        <listitem>
+          <para>
+            Before a given transaction can acquire an
+            <replaceable>S</replaceable> lock on a given row, it must
+            first acquire an <replaceable>IS</replaceable> or stronger
+            lock on the table containing that row.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Before a given transaction can acquire an
+            <replaceable>X</replaceable> lock on a given row, it must
+            first acquire an <replaceable>IX</replaceable> lock on the
+            table containing that row.
+          </para>
+        </listitem>
+
+      </itemizedlist>
+
+      <para>
+        These rules can be conveniently summarized by means of a
+        <emphasis>lock type compatibility matrix</emphasis>:
+      </para>
+
+      <informaltable>
+        <tgroup cols="5">
+          <colspec colwidth="20*"/>
+          <colspec colwidth="20*"/>
+          <colspec colwidth="20*"/>
+          <colspec colwidth="20*"/>
+          <colspec colwidth="20*"/>
+          <tbody>
+            <row>
+              <entry/>
+              <entry><replaceable>X</replaceable></entry>
+              <entry><replaceable>IX</replaceable></entry>
+              <entry><replaceable>S</replaceable></entry>
+              <entry><replaceable>IS</replaceable></entry>
+            </row>
+            <row>
+              <entry><replaceable>X</replaceable></entry>
+              <entry>Conflict</entry>
+              <entry>Conflict</entry>
+              <entry>Conflict</entry>
+              <entry>Conflict</entry>
+            </row>
+            <row>
+              <entry><replaceable>IX</replaceable></entry>
+              <entry>Conflict</entry>
+              <entry>Compatible</entry>
+              <entry>Conflict</entry>
+              <entry>Compatible</entry>
+            </row>
+            <row>
+              <entry><replaceable>S</replaceable></entry>
+              <entry>Conflict</entry>
+              <entry>Conflict</entry>
+              <entry>Compatible</entry>
+              <entry>Compatible</entry>
+            </row>
+            <row>
+              <entry><replaceable>IS</replaceable></entry>
+              <entry>Conflict</entry>
+              <entry>Compatible</entry>
+              <entry>Compatible</entry>
+              <entry>Compatible</entry>
+            </row>
+          </tbody>
+        </tgroup>
+      </informaltable>
+
+      <para>
+        A lock is granted to a requesting transaction if it is
+        compatible with existing locks. A lock is not granted to a
+        requesting transaction if it conflicts with existing locks. A
+        transaction waits until the conflicting existing lock is
+        released. If a lock request conflicts with an existing lock and
+        cannot be granted because it would cause deadlock, an error
+        occurs.
+      </para>
+
+      <para>
+        Thus, intention locks do not block anything except full table
+        requests (for example, <literal>LOCK TABLES ...
+        WRITE</literal>). The main purpose of
+        <replaceable>IX</replaceable> and
<replaceable>IS</replaceable>
+        locks is to show that someone is locking a row, or going to lock
+        a row in the table.
+      </para>
+
+      <para>
+        The following example illustrates how an error can occur when a
+        lock request would cause a deadlock. The example involves two
+        clients, A and B.
+      </para>
+
+      <para>
+        First, client A creates a table containing one row, and then
+        begins a transaction. Within the transaction, A obtains an
+        <replaceable>S</replaceable> lock on the row by selecting it in
+        share mode:
+      </para>
+
+<programlisting>
+<!--
+mysql> DROP TABLE IF EXISTS t;
+Query OK, 0 rows affected (0.26 sec)
+-->
+mysql&gt; <userinput>CREATE TABLE t (i INT) ENGINE = InnoDB;</userinput>
+Query OK, 0 rows affected (1.07 sec)
+
+mysql&gt; <userinput>INSERT INTO t (i) VALUES(1);</userinput>
+Query OK, 1 row affected (0.09 sec)
+
+mysql&gt; <userinput>START TRANSACTION;</userinput>
+Query OK, 0 rows affected (0.00 sec)
+
+mysql&gt; <userinput>SELECT * FROM t WHERE i = 1 LOCK IN SHARE
MODE;</userinput>
++------+
+| i    |
++------+
+|    1 |
++------+
+1 row in set (0.10 sec)
+</programlisting>
+
+      <para>
+        Next, client B begins a transaction and attempts to delete the
+        row from the table:
+      </para>
+
+<programlisting>
+mysql&gt; <userinput>START TRANSACTION;</userinput>
+Query OK, 0 rows affected (0.00 sec)
+
+mysql&gt; <userinput>DELETE FROM t WHERE i = 1;</userinput>
+</programlisting>
+
+      <para>
+        The delete operation requires an <replaceable>X</replaceable>
+        lock. The lock cannot be granted because it is incompatible with
+        the <replaceable>S</replaceable> lock that client A holds, so
+        the request goes on the queue of lock requests for the row and
+        client B blocks.
+      </para>
+
+      <para>
+        Finally, client A also attempts to delete the row from the
+        table:
+      </para>
+
+<programlisting>
+mysql&gt; <userinput>DELETE FROM t WHERE i = 1;</userinput>
+ERROR 1213 (40001): Deadlock found when trying to get lock;
+try restarting transaction
+</programlisting>
+
+      <para>
+        Deadlock occurs here because client A needs an
+        <replaceable>X</replaceable> lock to delete the row. However,
+        that lock request cannot be granted because client B already has
+        a request for an <replaceable>X</replaceable> lock and is
+        waiting for client A to release its <replaceable>S</replaceable>
+        lock. Nor can the <replaceable>S</replaceable> lock held by A be
+        upgraded to an <replaceable>X</replaceable> lock because of the
+        prior request by B for an <replaceable>X</replaceable> lock. As
+        a result, <literal>InnoDB</literal> generates an error for
+        client A and releases its locks. At that point, the lock request
+        for client B can be granted and B deletes the row from the
+        table.
+      </para>
+
+    </section>
+
+    <section id="innodb-and-autocommit">
+
+      <title><literal>InnoDB</literal> and
<literal>AUTOCOMMIT</literal></title>
+
+      <para>
+        In <literal>InnoDB</literal>, all user activity occurs inside a
+        transaction. If the autocommit mode is enabled, each SQL
+        statement forms a single transaction on its own. By default,
+        MySQL starts new connections with autocommit enabled.
+      </para>
+
+      <para>
+        If the autocommit mode is switched off with <literal>SET
+        AUTOCOMMIT = 0</literal>, then we can consider that a user
+        always has a transaction open. An SQL <literal>COMMIT</literal>
+        or <literal>ROLLBACK</literal> statement ends the current
+        transaction and a new one starts. A <literal>COMMIT</literal>
+        means that the changes made in the current transaction are made
+        permanent and become visible to other users. A
+        <literal>ROLLBACK</literal> statement, on the other hand,
+        cancels all modifications made by the current transaction. Both
+        statements release all <literal>InnoDB</literal> locks that were
+        set during the current transaction.
+      </para>
+
+      <para>
+        If the connection has autocommit enabled, the user can still
+        perform a multiple-statement transaction by starting it with an
+        explicit <literal>START TRANSACTION</literal> or
+        <literal>BEGIN</literal> statement and ending it with
+        <literal>COMMIT</literal> or <literal>ROLLBACK</literal>.
+      </para>
+
+    </section>
+
+    <section id="innodb-transaction-isolation">
+
+      <title><literal>InnoDB</literal> and <literal>TRANSACTION
ISOLATION
+        LEVEL</literal></title>
+
+      <para>
+        In terms of the SQL:1992 transaction isolation levels, the
+        <literal>InnoDB</literal> default is <literal>REPEATABLE
+        READ</literal>. Starting from MySQL 4.0.5,
+        <literal>InnoDB</literal> offers all four different transaction
+        isolation levels described by the SQL standard. You can set the
+        default isolation level for all connections by using the
+        <option>--transaction-isolation</option> option on the command
+        line or in an option file. For example, you can set the option
+        in the <literal>[mysqld]</literal> section of an option file
+        like this:
+      </para>
+
+<programlisting>
+[mysqld]
+transaction-isolation = {READ-UNCOMMITTED | READ-COMMITTED
+                         | REPEATABLE-READ | SERIALIZABLE}
+</programlisting>
+
+      <para>
+        A user can change the isolation level for a single session or
+        for all new incoming connections with the <literal>SET
+        TRANSACTION</literal> statement. Its syntax is as follows:
+      </para>
+
+<programlisting>
+SET [SESSION | GLOBAL] TRANSACTION ISOLATION LEVEL
+                       {READ UNCOMMITTED | READ COMMITTED
+                        | REPEATABLE READ | SERIALIZABLE}
+</programlisting>
+
+      <para>
+        Note that there are hyphens in the level names for the
+        <option>--transaction-isolation</option> option, but not for the
+        <literal>SET TRANSACTION</literal> statement.
+      </para>
+
+      <para>
+        The default behavior is to set the isolation level for the next
+        (not started) transaction. If you use the
+        <literal>GLOBAL</literal> keyword, the statement sets the
+        default transaction level globally for all new connections
+        created from that point on (but not for existing connections).
+        You need the <literal>SUPER</literal> privilege to do this.
+        Using the <literal>SESSION</literal> keyword sets the default
+        transaction level for all future transactions performed on the
+        current connection.
+      </para>
+
+      <para>
+        Any client is free to change the session isolation level (even
+        in the middle of a transaction), or the isolation level for the
+        next transaction.
+      </para>
+
+      <para>
+        Before MySQL 3.23.50, <literal>SET TRANSACTION</literal> had no
+        effect on <literal>InnoDB</literal> tables. Before 4.0.5, only
+        <literal>REPEATABLE READ</literal> and
+        <literal>SERIALIZABLE</literal> were available.
+      </para>
+
+      <para>
+        You can determine the global and session transaction isolation
+        levels by checking the value of the
+        <literal>tx_isolation</literal> system variable with these
+        statements:
+      </para>
+
+<programlisting>
+SELECT @@global.tx_isolation;
+SELECT @@tx_isolation;
+</programlisting>
+
+      <indexterm>
+        <primary>next-key lock</primary>
+      </indexterm>
+
+      <indexterm>
+        <primary>gap lock</primary>
+      </indexterm>
+
+      <para>
+        In row-level locking, <literal>InnoDB</literal> uses next-key
+        locking. That means that besides index records,
+        <literal>InnoDB</literal> can also lock the
<quote>gap</quote>
+        preceding an index record to block insertions by other users
+        immediately before the index record. A next-key lock refers to a
+        lock that locks an index record and the gap before it. A gap
+        lock refers to a lock that only locks a gap before some index
+        record. Next-key locking for searches or index scans can be
+        disabled by enabling the
+        <literal>innodb_locks_unsafe_for_binlog</literal> system
+        variable.
+      </para>
+
+      <para>
+        A detailed description of each isolation level in
+        <literal>InnoDB</literal> follows:
+      </para>
+
+      <itemizedlist>
+
+        <listitem>
+          <para>
+            <literal>READ UNCOMMITTED</literal>
+          </para>
+
+          <para>
+            <literal>SELECT</literal> statements are performed in a
+            non-locking fashion, but a possible earlier version of a
+            record might be used. Thus, using this isolation level, such
+            reads are not consistent. This is also called a <quote>dirty
+            read.</quote> Otherwise, this isolation level works like
+            <literal>READ COMMITTED</literal>.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>READ COMMITTED</literal>
+          </para>
+
+          <para>
+            A somewhat Oracle-like isolation level. All <literal>SELECT
+            ... FOR UPDATE</literal> and <literal>SELECT ... LOCK IN
+            SHARE MODE</literal> statements lock only the index records,
+            not the gaps before them, and thus allow the free insertion
+            of new records next to locked records.
+            <literal>UPDATE</literal> and
<literal>DELETE</literal>
+            statements using a unique index with a unique search
+            condition lock only the index record found, not the gap
+            before it. In range-type <literal>UPDATE</literal> and
+            <literal>DELETE</literal> statements,
+            <literal>InnoDB</literal> must set next-key or gap locks and
+            block insertions by other users to the gaps covered by the
+            range. This is necessary because <quote>phantom rows</quote>
+            must be blocked for MySQL replication and recovery to work.
+          </para>
+
+          <para>
+            Consistent reads behave as in Oracle: Each consistent read,
+            even within the same transaction, sets and reads its own
+            fresh snapshot. See
+            <xref linkend="innodb-consistent-read"/>.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>REPEATABLE READ</literal>
+          </para>
+
+          <para>
+            This is the default isolation level of
+            <literal>InnoDB</literal>. <literal>SELECT ... FOR
+            UPDATE</literal>, <literal>SELECT ... LOCK IN SHARE
+            MODE</literal>, <literal>UPDATE</literal>, and
+            <literal>DELETE</literal> statements that use a unique index
+            with a unique search condition lock only the index record
+            found, not the gap before it. With other search conditions,
+            these operations employ next-key locking, locking the index
+            range scanned with next-key or gap locks, and block new
+            insertions by other users.
+          </para>
+
+          <para>
+            In consistent reads, there is an important difference from
+            the <literal>READ COMMITTED</literal> isolation level: All
+            consistent reads within the same transaction read the same
+            snapshot established by the first read. This convention
+            means that if you issue several plain
+            <literal>SELECT</literal> statements within the same
+            transaction, these <literal>SELECT</literal> statements are
+            consistent also with respect to each other. See
+            <xref linkend="innodb-consistent-read"/>.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>SERIALIZABLE</literal>
+          </para>
+
+          <para>
+            This level is like <literal>REPEATABLE READ</literal>, but
+            <literal>InnoDB</literal> implicitly converts all plain
+            <literal>SELECT</literal> statements to <literal>SELECT ...
+            LOCK IN SHARE MODE</literal>.
+          </para>
+        </listitem>
+
+      </itemizedlist>
+
+    </section>
+
+    <section id="innodb-consistent-read">
+
+      <title>Consistent Non-Locking Read</title>
+
+      <para>
+        A consistent read means that <literal>InnoDB</literal> uses
+        multi-versioning to present to a query a snapshot of the
+        database at a point in time. The query sees the changes made by
+        those transactions that committed before that point of time, and
+        no changes made by later or uncommitted transactions. The
+        exception to this rule is that the query sees the changes made
+        by earlier statements within the same transaction. Note that the
+        exception to the rule causes the following anomaly: if you
+        update some rows in a table, a <literal>SELECT</literal> will
+        see the latest version of the updated rows, but it might also
+        see older versions of any rows. If other users simultaneously
+        update the same table, the anomaly means that you may see the
+        table in a state that never existed in the database.
+      </para>
+
+      <para>
+        If you are running with the default <literal>REPEATABLE
+        READ</literal> isolation level, all consistent reads within the
+        same transaction read the snapshot established by the first such
+        read in that transaction. You can get a fresher snapshot for
+        your queries by committing the current transaction and after
+        that issuing new queries.
+      </para>
+
+      <para>
+        Consistent read is the default mode in which
+        <literal>InnoDB</literal> processes
<literal>SELECT</literal>
+        statements in <literal>READ COMMITTED</literal> and
+        <literal>REPEATABLE READ</literal> isolation levels. A
+        consistent read does not set any locks on the tables it
+        accesses, and therefore other users are free to modify those
+        tables at the same time a consistent read is being performed on
+        the table.
+      </para>
+
+      <para>
+        Note that consistent read does not work over <literal>DROP
+        TABLE</literal> and over <literal>ALTER TABLE</literal>.
+        Consistent read does not work over <literal>DROP TABLE</literal>
+        because MySQL can't use a table that has been dropped and
+        <literal>InnoDB</literal> destroys the table. Consistent read
+        does not work over <literal>ALTER TABLE</literal> because
+        <literal>ALTER TABLE</literal> works by making a temporary copy
+        of the original table and deleting the original table when the
+        temporary copy is built. When you reissue a consistent read
+        within a transaction, rows in the new table are not visible
+        because those rows did not exist when the transaction's snapshot
+        was taken.
+      </para>
+
+    </section>
+
+    <section id="innodb-locking-reads">
+
+      <title><literal>SELECT ... FOR UPDATE</literal> and
<literal>SELECT ... LOCK IN
+        SHARE MODE</literal> Locking Reads</title>
+
+      <para>
+        In some circumstances, a consistent read is not convenient. For
+        example, you might want to add a new row into your table
+        <literal>child</literal>, and make sure that the child has a
+        parent in table <literal>parent</literal>. The following example
+        shows how to implement referential integrity in your application
+        code.
+      </para>
+
+      <para>
+        Suppose that you use a consistent read to read the table
+        <literal>parent</literal> and indeed see the parent of the child
+        in the table. Can you safely add the child row to table
+        <literal>child</literal>? No, because it may happen that
+        meanwhile some other user deletes the parent row from the table
+        <literal>parent</literal> without you being aware of it.
+      </para>
+
+      <para>
+        The solution is to perform the <literal>SELECT</literal> in a
+        locking mode using <literal>LOCK IN SHARE MODE</literal>:
+      </para>
+
+<programlisting>
+SELECT * FROM parent WHERE NAME = 'Jones' LOCK IN SHARE MODE;
+</programlisting>
+
+      <para>
+        Performing a read in share mode means that we read the latest
+        available data, and set a shared mode lock on the rows we read.
+        A shared mode lock prevents others from updating or deleting the
+        row we have read. Also, if the latest data belongs to a yet
+        uncommitted transaction of another client connection, we wait
+        until that transaction commits. After we see that the preceding
+        query returns the parent <literal>'Jones'</literal>, we can
+        safely add the child record to the <literal>child</literal>
+        table and commit our transaction.
+      </para>
+
+      <para>
+        Let us look at another example: We have an integer counter field
+        in a table <literal>child_codes</literal> that we use to assign
+        a unique identifier to each child added to table
+        <literal>child</literal>. Obviously, using a consistent read or
+        a shared mode read to read the present value of the counter is
+        not a good idea because two users of the database may then see
+        the same value for the counter, and a duplicate-key error occurs
+        if two users attempt to add children with the same identifier to
+        the table.
+      </para>
+
+      <para>
+        Here, <literal>LOCK IN SHARE MODE</literal> is not a good
+        solution because if two users read the counter at the same time,
+        at least one of them ends up in deadlock when attempting to
+        update the counter.
+      </para>
+
+      <para>
+        In this case, there are two good ways to implement the reading
+        and incrementing of the counter: (1) update the counter first by
+        incrementing it by 1 and only after that read it, or (2) read
+        the counter first with a lock mode <literal>FOR
+        UPDATE</literal>, and increment after that. The latter approach
+        can be implemented as follows:
+      </para>
+
+<programlisting>
+SELECT counter_field FROM child_codes FOR UPDATE;
+UPDATE child_codes SET counter_field = counter_field + 1;
+</programlisting>
+
+      <para>
+        A <literal>SELECT ... FOR UPDATE</literal> reads the latest
+        available data, setting exclusive locks on each row it reads.
+        Thus, it sets the same locks a searched SQL
+        <literal>UPDATE</literal> would set on the rows.
+      </para>
+
+      <para>
+        The preceding description is merely an example of how
+        <literal>SELECT ... FOR UPDATE</literal> works. In MySQL, the
+        specific task of generating a unique identifier actually can be
+        accomplished using only a single access to the table:
+      </para>
+
+<programlisting>
+UPDATE child_codes SET counter_field = LAST_INSERT_ID(counter_field + 1);
+SELECT LAST_INSERT_ID();
+</programlisting>
+
+      <para>
+        The <literal>SELECT</literal> statement merely retrieves the
+        identifier information (specific to the current connection). It
+        does not access any table.
+      </para>
+
+      <para>
+        Locks set by <literal>IN SHARE MODE</literal> and <literal>FOR
+        UPDATE</literal> reads are released when the transaction is
+        committed or rolled back.
+      </para>
+
+      <note>
+        <para>
+          Locking of rows for update using <literal>SELECT FOR
+          UPDATE</literal> only applies when
+          <literal>AUTOCOMMIT</literal> is switched off. If
+          <literal>AUTOCOMMIT</literal> is on, then the rows matching
+          the specification are not locked.
+        </para>
+      </note>
+
+    </section>
+
+    <section id="innodb-next-key-locking">
+
+      <title>Next-Key Locking: Avoiding the Phantom Problem</title>
+
+      <indexterm>
+        <primary>next-key lock</primary>
+      </indexterm>
+
+      <indexterm>
+        <primary>gap lock</primary>
+      </indexterm>
+
+      <para>
+        In row-level locking, <literal>InnoDB</literal> uses an
+        algorithm called <emphasis>next-key locking</emphasis>.
+        <literal>InnoDB</literal> performs the row-level locking in such
+        a way that when it searches or scans an index of a table, it
+        sets shared or exclusive locks on the index records it
+        encounters. Thus, the row-level locks are actually index record
+        locks.
+      </para>
+
+      <para>
+        The next-key locks that <literal>InnoDB</literal> sets on index
+        records also affect the <quote>gap</quote> before that index
+        record. If a user has a shared or exclusive lock on record
+        <literal>R</literal> in an index, another user cannot insert a
+        new index record immediately before <literal>R</literal> in the
+        index order. (A gap lock refers to a lock that only locks a gap
+        before some index record.)
+      </para>
+
+      <para>
+        This next-key locking of gaps is done to prevent the so-called
+        <quote>phantom problem.</quote> Suppose that you want to read
+        and lock all children from the <literal>child</literal> table
+        having an identifier value greater than 100, with the intention
+        of updating some column in the selected rows later:
+      </para>
+
+<programlisting>
+SELECT * FROM child WHERE id &gt; 100 FOR UPDATE;
+</programlisting>
+
+      <para>
+        Suppose that there is an index on the <literal>id</literal>
+        column. The query scans that index starting from the first
+        record where <literal>id</literal> is bigger than 100. If the
+        locks set on the index records would not lock out inserts made
+        in the gaps, a new row might meanwhile be inserted to the table.
+        If you execute the same <literal>SELECT</literal> within the
+        same transaction, you would see a new row in the result set
+        returned by the query. This is contrary to the isolation
+        principle of transactions: A transaction should be able to run
+        so that the data it has read does not change during the
+        transaction. If we regard a set of rows as a data item, the new
+        <quote>phantom</quote> child would violate this isolation
+        principle.
+      </para>
+
+      <para>
+        When <literal>InnoDB</literal> scans an index, it can also lock
+        the gap after the last record in the index. Just that happens in
+        the previous example: The locks set by <literal>InnoDB</literal>
+        prevent any insert to the table where <literal>id</literal>
+        would be bigger than 100.
+      </para>
+
+      <para>
+        You can use next-key locking to implement a uniqueness check in
+        your application: If you read your data in share mode and do not
+        see a duplicate for a row you are going to insert, then you can
+        safely insert your row and know that the next-key lock set on
+        the successor of your row during the read prevents anyone
+        meanwhile inserting a duplicate for your row. Thus, the next-key
+        locking allows you to <quote>lock</quote> the non-existence of
+        something in your table.
+      </para>
+
+    </section>
+
+    <section id="innodb-consistent-read-example">
+
+      <title>An Example of Consistent Read in
<literal>InnoDB</literal></title>
+
+      <para>
+        Suppose that you are running in the default <literal>REPEATABLE
+        READ</literal> isolation level. When you issue a consistent read
+        (that is, an ordinary <literal>SELECT</literal> statement),
+        <literal>InnoDB</literal> gives your transaction a timepoint
+        according to which your query sees the database. If another
+        transaction deletes a row and commits after your timepoint was
+        assigned, you do not see the row as having been deleted. Inserts
+        and updates are treated similarly.
+      </para>
+
+      <para>
+        You can advance your timepoint by committing your transaction
+        and then doing another <literal>SELECT</literal>.
+      </para>
+
+      <para>
+        This is called <firstterm>multi-versioned concurrency
+        control</firstterm>.
+      </para>
+
+      <remark role="todo">
+        More ASCII art that needs to be turned into a graphic.
+      </remark>
+
+<programlisting>
+               User A                 User B
+
+           SET AUTOCOMMIT=0;      SET AUTOCOMMIT=0;
+time
+|          SELECT * FROM t;
+|          empty set
+|                                 INSERT INTO t VALUES (1, 2);
+|
+v          SELECT * FROM t;
+           empty set
+                                  COMMIT;
+
+           SELECT * FROM t;
+           empty set
+
+           COMMIT;
+
+           SELECT * FROM t;
+           ---------------------
+           |    1    |    2    |
+           ---------------------
+           1 row in set
+</programlisting>
+
+      <para>
+        In this example, user A sees the row inserted by B only when B
+        has committed the insert and A has committed as well, so that
+        the timepoint is advanced past the commit of B.
+      </para>
+
+      <para>
+        If you want to see the <quote>freshest</quote> state of the
+        database, you should use either the <literal>READ
+        COMMITTED</literal> isolation level or a locking read:
+      </para>
+
+<programlisting>
+SELECT * FROM t LOCK IN SHARE MODE;
+</programlisting>
+
+    </section>
+
+    <section id="innodb-locks-set">
+
+      <title>Locks Set by Different SQL Statements in
<literal>InnoDB</literal></title>
+
+      <para>
+        A locking read, an <literal>UPDATE</literal>, or a
+        <literal>DELETE</literal> generally set record locks on every
+        index record that is scanned in the processing of the SQL
+        statement. It does not matter if there are
+        <literal>WHERE</literal> conditions in the statement that would
+        exclude the row. <literal>InnoDB</literal> does not remember the
+        exact <literal>WHERE</literal> condition, but only knows which
+        index ranges were scanned. The record locks are normally
+        next-key locks that also block inserts to the <quote>gap</quote>
+        immediately before the record.
+      </para>
+
+      <para>
+        If the locks to be set are exclusive, <literal>InnoDB</literal>
+        always retrieves also the clustered index record and sets a lock
+        on it.
+      </para>
+
+      <para>
+        If you do not have indexes suitable for your statement and MySQL
+        has to scan the whole table to process the statement, every row
+        of the table becomes locked, which in turn blocks all inserts by
+        other users to the table. It is important to create good indexes
+        so that your queries do not unnecessarily need to scan many
+        rows.
+      </para>
+
+      <para>
+        For <literal>SELECT ... FOR UPDATE</literal> or <literal>SELECT
+        ... IN SHARE MODE</literal>, locks are acquired for scanned
+        rows, and expected to be released for rows that do not qualify
+        for inclusion in the result set (for example, if they do not
+        meet the criteria given in the <literal>WHERE</literal> clause).
+        However, in some cases, rows might not be unlocked immediately
+        because the relationship between a result row and its original
+        source is lost during query execution. For example, in a
+        <literal>UNION</literal>, scanned (and locked) rows from a table
+        might be inserted into a temporary table before evaluation
+        whether they qualify for the result set. In this circumstance,
+        the relationship of the rows in the temporary table to the rows
+        in the original table is lost and the latter rows are not
+        unlocked until the end of query execution.
+      </para>
+
+      <para>
+        <literal>InnoDB</literal> sets specific types of locks as
+        follows:
+      </para>
+
+      <itemizedlist>
+
+        <listitem>
+          <para>
+            <literal>SELECT ... FROM</literal> is a consistent read,
+            reading a snapshot of the database and setting no locks
+            unless the transaction isolation level is set to
+            <literal>SERIALIZABLE</literal>. For
+            <literal>SERIALIZABLE</literal> level, this sets shared
+            next-key locks on the index records it encounters.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>SELECT ... FROM ... LOCK IN SHARE MODE</literal>
+            sets shared next-key locks on all index records the read
+            encounters.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>SELECT ... FROM ... FOR UPDATE</literal> sets
+            exclusive next-key locks on all index records the read
+            encounters.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>UPDATE ... WHERE ...</literal> sets an exclusive
+            next-key lock on every record the search encounters.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>DELETE FROM ... WHERE ...</literal> sets an
+            exclusive next-key lock on every record the search
+            encounters.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>INSERT INTO ... VALUES (...)</literal> sets an
+            exclusive lock on the inserted row. Note that this lock is
+            not a next-key lock and does not prevent other users from
+            inserting to the gap before the inserted row. If a
+            duplicate-key error occurs, a shared lock on the duplicate
+            index record is set.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>REPLACE</literal> is done like an
+            <literal>INSERT</literal> if there is no collision on a
+            unique key. Otherwise, an exclusive next-key lock is placed
+            on the row that has to be updated.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            While initializing a previously specified
+            <literal>AUTO_INCREMENT</literal> column on a table,
+            <literal>InnoDB</literal> sets an exclusive lock on the end
+            of the index associated with the
+            <literal>AUTO_INCREMENT</literal> column. In accessing the
+            auto-increment counter, <literal>InnoDB</literal> uses a
+            specific table lock mode <literal>AUTO-INC</literal> where
+            the lock lasts only to the end of the current SQL statement,
+            not to the end of the entire transaction. Note that other
+            clients cannot insert into the table while the
+            <literal>AUTO-INC</literal> table lock is held; see
+            <xref linkend="innodb-and-autocommit"/>.
+          </para>
+
+          <para>
+            Before MySQL 3.23.50, <literal>SHOW TABLE STATUS</literal>
+            applied to a table with an <literal>AUTO_INCREMENT</literal>
+            column sets an exclusive row-level lock to the high end of
+            the <literal>AUTO_INCREMENT</literal> index. This means also
+            that <literal>SHOW TABLE STATUS</literal> could cause a
+            deadlock of transactions, something that may surprise users.
+            Starting from MySQL 3.23.50, <literal>InnoDB</literal>
+            fetches the value of a previously initialized
+            <literal>AUTO_INCREMENT</literal> column without setting any
+            locks.
+          </para>
+        </listitem>
+
+        <listitem>
+          <remark role="note">
+            This behavior changes in 5.0. Don't merge the 5.0
+            description in here.
+          </remark>
+
+          <para>
+            <literal>INSERT INTO T SELECT ... FROM S WHERE ...</literal>
+            sets an exclusive (non-next-key) lock on each row inserted
+            into <literal>T</literal>. It does the search on
+            <literal>S</literal> as a consistent read, but sets shared
+            next-key locks on <literal>S</literal> if MySQL binary
+            logging is turned on. <literal>InnoDB</literal> has to set
+            locks in the latter case: In roll-forward recovery from a
+            backup, every SQL statement has to be executed in exactly
+            the same way it was done originally.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>CREATE TABLE ... SELECT ...</literal> performs the
+            <literal>SELECT</literal> as a consistent read or with
+            shared locks, as in the previous item.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            If a <literal>FOREIGN KEY</literal> constraint is defined on
+            a table, any insert, update, or delete that requires the
+            constraint condition to be checked sets shared record-level
+            locks on the records that it looks at to check the
+            constraint. <literal>InnoDB</literal> also sets these locks
+            in the case where the constraint fails.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>LOCK TABLES</literal> sets table locks, but it is
+            the higher MySQL layer above the <literal>InnoDB</literal>
+            layer that sets these locks. Beginning with MySQL 4.0.20 and
+            4.1.2, <literal>InnoDB</literal> is aware of table locks if
+            <literal>innodb_table_locks=1</literal> (the default) and
+            <literal>AUTOCOMMIT=0</literal>, and the MySQL layer above
+            <literal>InnoDB</literal> knows about row-level locks.
+            Otherwise, InnoDB's automatic deadlock detection cannot
+            detect deadlocks where such table locks are involved. Also,
+            because the higher MySQL layer does not know about row-level
+            locks, it is possible to get a table lock on a table where
+            another user currently has row-level locks. However, this
+            does not endanger transaction integrity, as discussed in
+            <xref linkend="innodb-deadlock-detection"/>. See also
+            <xref linkend="innodb-restrictions"/>.
+          </para>
+        </listitem>
+
+      </itemizedlist>
+
+    </section>
+
+    <section id="innodb-implicit-commit">
+
+      <title>Implicit Transaction Commit and Rollback</title>
+
+      <para>
+        By default, MySQL begins each client connection with autocommit
+        mode enabled. When autocommit is enabled, MySQL does a commit
+        after each SQL statement if that statement did not return an
+        error. If an SQL statement returns an error, the commit or
+        rollback behavior depends on the error. See
+        <xref linkend="innodb-error-handling"/>.
+      </para>
+
+      <para>
+        If you have the autocommit mode off and close a connection
+        without explicitly committing the final transaction, MySQL rolls
+        back that transaction.
+      </para>
+
+      <para>
+        For details about which statements implicitly end a transaction,
+        as if you had done a <literal>COMMIT</literal> before executing
+        the statement, see <xref linkend="implicit-commit"/>.
+      </para>
+
+    </section>
+
+    <section id="innodb-deadlock-detection">
+
+      <title>Deadlock Detection and Rollback</title>
+
+      <para>
+        <literal>InnoDB</literal> automatically detects a deadlock of
+        transactions and rolls back a transaction or transactions to
+        break the deadlock. Starting from MySQL 4.0.5,
+        <literal>InnoDB</literal> tries to pick small transactions to
+        roll back, the size of a transaction being determined by the
+        number of rows inserted, updated, or deleted. Prior to 4.0.5,
+        <literal>InnoDB</literal> always rolled back the transaction
+        whose lock request was the last one to build a deadlock, that
+        is, a cycle in the <quote>waits-for</quote> graph of
+        transactions.
+      </para>
+
+      <para>
+        Beginning with MySQL 4.0.20 and 4.1.2, <literal>InnoDB</literal>
+        is aware of table locks if
+        <literal>innodb_table_locks=1</literal> (the default) and
+        <literal>AUTOCOMMIT=0</literal>, and the MySQL layer above
+        <literal>InnoDB</literal> knows about row-level locks. Before
+        that, <literal>InnoDB</literal> cannot detect deadlocks where a
+        table lock set by a MySQL <literal>LOCK TABLES</literal>
+        statement is involved, or if a lock set by another storage
+        engine than <literal>InnoDB</literal> is involved. You have to
+        resolve these situations by setting the value of the
+        <literal>innodb_lock_wait_timeout</literal> system variable.
+      </para>
+
+      <para>
+        When <literal>InnoDB</literal> performs a complete rollback of a
+        transaction, all locks set by the transaction are released.
+        However, if just a single SQL statement is rolled back as a
+        result of an error, some of the locks set by the statement may
+        be preserved. This happens because <literal>InnoDB</literal>
+        stores row locks in a format such that it cannot know afterward
+        which lock was set by which statement.
+      </para>
+
+    </section>
+
+    <section id="innodb-deadlocks">
+
+      <title>How to Cope with Deadlocks</title>
+
+      <para>
+        Deadlocks are a classic problem in transactional databases, but
+        they are not dangerous unless they are so frequent that you
+        cannot run certain transactions at all. Normally, you must write
+        your applications so that they are always prepared to re-issue a
+        transaction if it gets rolled back because of a deadlock.
+      </para>
+
+      <para>
+        <literal>InnoDB</literal> uses automatic row-level locking. You
+        can get deadlocks even in the case of transactions that just
+        insert or delete a single row. That is because these operations
+        are not really <quote>atomic</quote>; they automatically set
+        locks on the (possibly several) index records of the row
+        inserted or deleted.
+      </para>
+
+      <para>
+        You can cope with deadlocks and reduce the likelihood of their
+        occurrence with the following techniques:
+      </para>
+
+      <itemizedlist>
+
+        <listitem>
+          <para>
+            Use <literal>SHOW INNODB STATUS</literal> to determine the
+            cause of the latest deadlock. That can help you to tune your
+            application to avoid deadlocks. This strategy can be used as
+            of MySQL 3.23.52 and 4.0.3, depending on your MySQL series.
+            From 4.1.2 on, use <literal>SHOW ENGINE INNODB
+            STATUS</literal>.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Always be prepared to re-issue a transaction if it fails due
+            to deadlock. Deadlocks are not dangerous. Just try again.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Commit your transactions often. Small transactions are less
+            prone to collision.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            If you are using locking reads (<literal>SELECT ... FOR
+            UPDATE</literal> or <literal>... LOCK IN SHARE
+            MODE</literal>), try using a lower isolation level such as
+            <literal>READ COMMITTED</literal>.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Access your tables and rows in a fixed order. Then
+            transactions form well-defined queues and do not deadlock.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Add well-chosen indexes to your tables. Then your queries
+            need to scan fewer index records and consequently set fewer
+            locks. Use <literal>EXPLAIN SELECT</literal> to determine
+            which indexes the MySQL server regards as the most
+            appropriate for your queries.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Use less locking. If you can afford to allow a
+            <literal>SELECT</literal> to return data from an old
+            snapshot, do not add the clause <literal>FOR
+            UPDATE</literal> or <literal>LOCK IN SHARE MODE</literal>
to
+            it. Using the <literal>READ COMMITTED</literal> isolation
+            level is good here, because each consistent read within the
+            same transaction reads from its own fresh snapshot.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            If nothing else helps, serialize your transactions with
+            table-level locks. The correct way to use <literal>LOCK
+            TABLES</literal> with transactional tables, such as
+            <literal>InnoDB</literal> tables, is to set
+            <literal>AUTOCOMMIT = 0</literal> and not to call
+            <literal>UNLOCK TABLES</literal> until after you commit the
+            transaction explicitly. For example, if you need to write to
+            table <literal>t1</literal> and read from table
+            <literal>t2</literal>, you can do this:
+          </para>
+
+<programlisting>
+SET AUTOCOMMIT=0;
+LOCK TABLES t1 WRITE, t2 READ, ...;
+<replaceable>... do something with tables t1 and t2 here ...</replaceable>
+COMMIT;
+UNLOCK TABLES;
+</programlisting>
+
+          <para>
+            Table-level locks make your transactions queue nicely, and
+            deadlocks are avoided.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Another way to serialize transactions is to create an
+            auxiliary <quote>semaphore</quote> table that contains just
+            a single row. Have each transaction update that row before
+            accessing other tables. In that way, all transactions happen
+            in a serial fashion. Note that the <literal>InnoDB</literal>
+            instant deadlock detection algorithm also works in this
+            case, because the serializing lock is a row-level lock. With
+            MySQL table-level locks, the timeout method must be used to
+            resolve deadlocks.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            In applications that use <literal>AUTOCOMMIT=1</literal> and
+            MySQL's <literal>LOCK TABLES</literal> command,
+            <literal>InnoDB</literal>'s internal table locks that were
+            present from 4.0.20 to 4.0.23 can cause deadlocks. Starting
+            from 4.0.22, you can set
+            <literal>innodb_table_locks=0</literal> in
+            <filename>my.cnf</filename> to fall back to the old behavior
+            and remove the problem. 4.0.24 does not set
+            <literal>InnoDB</literal> table locks if
+            <literal>AUTOCOMMIT=1</literal>.
+          </para>
+        </listitem>
+
+      </itemizedlist>
+
+    </section>
+
+  </section>
+
+  <section id="innodb-tuning">
+
+    <title><literal>InnoDB</literal> Performance Tuning
Tips</title>
+
+    <indexterm>
+      <primary>Solaris x86_64 issues</primary>
+    </indexterm>
+
+    <itemizedlist>
+
+      <listitem>
+        <para>
+          In <literal>InnoDB</literal>, having a long <literal>PRIMARY
+          KEY</literal> wastes a lot of disk space because its value
+          must be stored with every secondary index record. (See
+          <xref linkend="innodb-table-and-index"/>.) Create an
+          <literal>AUTO_INCREMENT</literal> column as the primary key if
+          your primary key is long.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          If the Unix <filename>top</filename> tool or the Windows Task
+          Manager shows that the CPU usage percentage with your workload
+          is less than 70%, your workload is probably disk-bound. Maybe
+          you are making too many transaction commits, or the buffer
+          pool is too small. Making the buffer pool bigger can help, but
+          do not set it equal to more than 80% of physical memory.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Wrap several modifications into one transaction.
+          <literal>InnoDB</literal> must flush the log to disk at each
+          transaction commit if that transaction made modifications to
+          the database. The rotation speed of a disk is typically at
+          most 167 revolutions/second, which constrains the number of
+          commits to the same 167<superscript>th</superscript> of a
+          second if the disk does not <quote>fool</quote> the operating
+          system.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          If you can afford the loss of some of the latest committed
+          transactions if a crash occurs, you can set the
+          <literal>innodb_flush_log_at_trx_commit</literal> parameter to
+          0. <literal>InnoDB</literal> tries to flush the log once per
+          second anyway, although the flush is not guaranteed.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Make your log files big, even as big as the buffer pool. When
+          <literal>InnoDB</literal> has written the log files full, it
+          has to write the modified contents of the buffer pool to disk
+          in a checkpoint. Small log files cause many unnecessary disk
+          writes. The drawback of big log files is that the recovery
+          time is longer.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Make the log buffer quite large as well (on the order of 8MB).
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Use the <literal>VARCHAR</literal> data type instead of
+          <literal>CHAR</literal> if you are storing variable-length
+          strings or if the column may contain many
+          <literal>NULL</literal> values. A
+          <literal>CHAR(<replaceable>N</replaceable>)</literal>
column
+          always takes <replaceable>N</replaceable> characters to store
+          data, even if the string is shorter or its value is
+          <literal>NULL</literal>. Smaller tables fit better in the
+          buffer pool and reduce disk I/O.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          (Relevant from 3.23.39 up.) In some versions of GNU/Linux and
+          Unix, flushing files to disk with the Unix
+          <literal>fsync()</literal> call (which
+          <literal>InnoDB</literal> uses by default) and other similar
+          methods is surprisingly slow. If you are dissatisfied with
+          database write performance, you might try setting the
+          <literal>innodb_flush_method</literal> parameter to
+          <literal>O_DSYNC</literal>. Although
+          <literal>O_DSYNC</literal> seems to be slower on most systems,
+          yours might not be one of them.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          (Verified using MySQL 4.1, assumed for other MySQL versions,
+          given that this is a platform architecture issue.) When using
+          the <literal>InnoDB</literal> storage engine on Solaris 10 for
+          x86_64 architecture (AMD Opteron), it is important to mount
+          any filesystems used for storing
+          <literal>InnoDB</literal>-related files using the
+          <literal>forcedirectio</literal> option. (The default on
+          Solaris 10/x86_64 is <emphasis>not</emphasis> to use this
+          option.) Failure to use <literal>forcedirectio</literal>
+          causes a serious degradation of <literal>InnoDB</literal>'s
+          speed and performance on this platform.
+        </para>
+
+        <para>
+          When using the <literal>InnoDB</literal> storage engine with a
+          large <literal>innodb_buffer_pool_size</literal> value on any
+          release of Solaris 2.6 and up and any platform
+          (sparc/x86/x64/amd64), a significant performance gain can be
+          achieved by placing <literal>InnoDB</literal> data files and
+          log files on raw devices or on a separate direct I/O UFS
+          filesystem (using mount option
+          <literal>forcedirectio</literal>; see
+          <literal>mount_ufs(1M)</literal>). Users of the Veritas
+          filesystem VxFS should use the mount option
+          <literal>convosync=direct</literal>.
+        </para>
+
+        <para>
+          Other MySQL data files, such as those for
+          <literal>MyISAM</literal> tables, should not be placed on a
+          direct I/O filesystem. Executables or libraries <emphasis>must
+          not</emphasis> be placed on a direct I/O filesystem.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          When importing data into <literal>InnoDB</literal>, make sure
+          that MySQL does not have autocommit mode enabled because that
+          requires a log flush to disk for every insert. To disable
+          autocommit during your import operation, surround it with
+          <literal>SET AUTOCOMMIT</literal> and
+          <literal>COMMIT</literal> statements:
+        </para>
+
+<programlisting>
+SET AUTOCOMMIT=0;
+<replaceable>... SQL import statements ...</replaceable>
+COMMIT;
+</programlisting>
+
+        <para>
+          If you use the <command>mysqldump</command> option
+          <option>--opt</option>, you get dump files that are fast to
+          import into an <literal>InnoDB</literal> table, even without
+          wrapping them with the <literal>SET AUTOCOMMIT</literal> and
+          <literal>COMMIT</literal> statements.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Beware of big rollbacks of mass inserts:
+          <literal>InnoDB</literal> uses the insert buffer to save disk
+          I/O in inserts, but no such mechanism is used in a
+          corresponding rollback. A disk-bound rollback can take 30
+          times as long to perform as the corresponding insert. Killing
+          the database process does not help because the rollback starts
+          again on server startup. The only way to get rid of a runaway
+          rollback is to increase the buffer pool so that the rollback
+          becomes CPU-bound and runs fast, or to use a special
+          procedure. See <xref linkend="forcing-recovery"/>.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Beware also of other big disk-bound operations. Use
+          <literal>DROP TABLE</literal> and <literal>CREATE
+          TABLE</literal> to empty a table, not <literal>DELETE FROM
+          <replaceable>tbl_name</replaceable></literal>.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Use the multiple-row <literal>INSERT</literal> syntax to
+          reduce communication overhead between the client and the
+          server if you need to insert many rows:
+        </para>
+
+<programlisting>
+INSERT INTO yourtable VALUES (1,2), (5,5), ...;
+</programlisting>
+
+        <para>
+          This tip is valid for inserts into any table, not just
+          <literal>InnoDB</literal> tables.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          If you have <literal>UNIQUE</literal> constraints on secondary
+          keys, starting from MySQL 3.23.52 and 4.0.3, you can speed up
+          table imports by temporarily turning off the uniqueness checks
+          during the import session:
+        </para>
+
+<programlisting>
+SET UNIQUE_CHECKS=0;
+<replaceable>... import operation ...</replaceable>
+SET UNIQUE_CHECKS=1;
+</programlisting>
+
+        <para>
+          For big tables, this saves a lot of disk I/O because
+          <literal>InnoDB</literal> can use its insert buffer to write
+          secondary index records in a batch. Be certain that the data
+          contains no duplicate keys. <literal>UNIQUE_CHECKS</literal>
+          allows but does not require storage engines to ignore
+          duplicate keys.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          If you have <literal>FOREIGN KEY</literal> constraints in your
+          tables, starting from MySQL 3.23.52 and 4.0.3, you can speed
+          up table imports by turning the foreign key checks off for a
+          while in the import session:
+        </para>
+
+<programlisting>
+SET FOREIGN_KEY_CHECKS=0;
+<replaceable>... import operation ...</replaceable>
+SET FOREIGN_KEY_CHECKS=1;
+</programlisting>
+
+        <para>
+          For big tables, this can save a lot of disk I/O.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          If you often have recurring queries for tables that are not
+          updated frequently, use the query cache available as of MySQL
+          4.0:
+        </para>
+
+<programlisting>
+[mysqld]
+query_cache_type = ON
+query_cache_size = 10M
+</programlisting>
+
+        <para>
+          In MySQL 4.0, the query cache works only with autocommit
+          enabled. This restriction is removed in MySQL 4.1.1 and up.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Unlike <literal>MyISAM</literal>,
<literal>InnoDB</literal>
+          does not store an index cardinality value in its tables.
+          Instead, <literal>InnoDB</literal> computes a cardinality for
+          a table the first time it accesses it after startup. With a
+          large number of tables, this might take significant time. It
+          is the initial table open operation that is important, so to
+          <quote>warm up</quote> a table for later use, you might want
+          to use it immediately after start up by issuing a statement
+          such as <literal>SELECT 1 FROM
+          <replaceable>tbl_name</replaceable> LIMIT 1</literal>.
+        </para>
+      </listitem>
+
+    </itemizedlist>
+
+    <formalpara role="mnmas">
+
+      <title>MySQL Enterprise</title>
+
+      <para>
+        For optimization recommendations geared to your specific
+        circumstances subscribe to the MySQL Enterprise Monitor. For
+        more information see,
+        <ulink url="&base-url-enterprise;advisors.html"/>.
+      </para>
+
+    </formalpara>
+
+    <section id="innodb-monitor">
+
+      <title><literal>SHOW ENGINE INNODB STATUS</literal> and the
+        <literal>InnoDB</literal> Monitors</title>
+
+      <para>
+        Starting from MySQL 3.23.42, <literal>InnoDB</literal> includes
+        <literal>InnoDB</literal> Monitors that print information about
+        the <literal>InnoDB</literal> internal state. Starting from
+        MySQL 3.23.52 and 4.0.3, you can use the <literal>SHOW INNODB
+        STATUS</literal> SQL statement at any time to fetch the output
+        of the standard <literal>InnoDB</literal> Monitor to your SQL
+        client. (In MySQL 4.1.2 and up, the statement is <literal>SHOW
+        ENGINE INNODB STATUS</literal>.) The information is useful in
+        performance tuning. If you are using the
+        <command>mysql</command> interactive SQL client, the output is
+        more readable if you replace the usual semicolon statement
+        terminator by <literal>\G</literal>: For a discussion about the
+        <literal>InnoDB</literal> lock modes, see
+        <xref linkend="innodb-lock-modes"/>.
+      </para>
+
+<programlisting>
+mysql&gt; <userinput>SHOW INNODB STATUS\G</userinput>
+</programlisting>
+
+      <para>
+        Another way to use <literal>InnoDB</literal> Monitors is to let
+        them periodically write data to the standard output of the
+        <command>mysqld</command> server. In this case, no output is
+        sent to clients. When switched on, <literal>InnoDB</literal>
+        Monitors print data about every 15 seconds. Server output
+        usually is directed to the <filename>.err</filename> log in the
+        MySQL data directory. This data is useful in performance tuning.
+        On Windows, you must start the server from a command prompt in a
+        console window with the <option>--console</option> option if you
+        want to direct the output to the window rather than to the error
+        log.
+      </para>
+
+      <para>
+        Monitor output includes the following types of information:
+      </para>
+
+      <itemizedlist>
+
+        <listitem>
+          <para>
+            Table and record locks held by each active transaction
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Lock waits of a transactions
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Semaphore waits of threads
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Pending file I/O requests
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Buffer pool statistics
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Purge and insert buffer merge activity of the main
+            <literal>InnoDB</literal> thread
+          </para>
+        </listitem>
+
+      </itemizedlist>
+
+      <para>
+        To cause the standard <literal>InnoDB</literal> Monitor to write
+        to the standard output of <command>mysqld</command>, use the
+        following SQL statement:
+      </para>
+
+<programlisting>
+CREATE TABLE innodb_monitor (a INT) TYPE=InnoDB;
+</programlisting>
+
+      <para>
+        The monitor can be stopped by issuing the following statement:
+      </para>
+
+<programlisting>
+DROP TABLE innodb_monitor;
+</programlisting>
+
+      <para>
+        The <literal>CREATE TABLE</literal> syntax is just a way to pass
+        a command to the <literal>InnoDB</literal> engine through
+        MySQL's SQL parser: The only things that matter are the table
+        name <literal>innodb_monitor</literal> and that it be an
+        <literal>InnoDB</literal> table. The structure of the table is
+        not relevant at all for the <literal>InnoDB</literal> Monitor.
+        If you shut down the server, the monitor does not restart
+        automatically when you restart the server. You must drop the
+        monitor table and issue a new <literal>CREATE TABLE</literal>
+        statement to start the monitor. (This syntax may change in a
+        future release.)
+      </para>
+
+      <para>
+        In a similar way, you can start
+        <literal>innodb_lock_monitor</literal>, which is otherwise the
+        same as <literal>innodb_monitor</literal> but also prints a lot
+        of lock information. A separate
+        <literal>innodb_tablespace_monitor</literal> prints a list of
+        created file segments existing in the tablespace and also
+        validates the tablespace allocation data structures. Starting
+        from 3.23.44, there is <literal>innodb_table_monitor</literal>
+        with which you can print the contents of the
+        <literal>InnoDB</literal> internal data dictionary.
+      </para>
+
+      <para>
+        A sample of <literal>InnoDB</literal> Monitor output:
+      </para>
+
+<programlisting>
+mysql&gt; <userinput>SHOW INNODB STATUS\G</userinput>
+*************************** 1. row ***************************
+Status:
+=====================================
+030709 13:00:59 INNODB MONITOR OUTPUT
+=====================================
+Per second averages calculated from the last 18 seconds
+----------
+SEMAPHORES
+----------
+OS WAIT ARRAY INFO: reservation count 413452, signal count 378357
+--Thread 32782 has waited at btr0sea.c line 1477 for 0.00 seconds the
+semaphore: X-lock on RW-latch at 41a28668 created in file btr0sea.c line 135
+a writer (thread id 32782) has reserved it in mode wait exclusive
+number of readers 1, waiters flag 1
+Last time read locked in file btr0sea.c line 731
+Last time write locked in file btr0sea.c line 1347
+Mutex spin waits 0, rounds 0, OS waits 0
+RW-shared spins 108462, OS waits 37964; RW-excl spins 681824, OS waits
+375485
+------------------------
+LATEST FOREIGN KEY ERROR
+------------------------
+030709 13:00:59 Transaction:
+TRANSACTION 0 290328284, ACTIVE 0 sec, process no 3195, OS thread id 34831
+inserting
+15 lock struct(s), heap size 2496, undo log entries 9
+MySQL thread id 25, query id 4668733 localhost heikki update
+insert into ibtest11a (D, B, C) values (5, 'khDk' ,'khDk')
+Foreign key constraint fails for table test/ibtest11a:
+,
+  CONSTRAINT `0_219242` FOREIGN KEY (`A`, `D`) REFERENCES `ibtest11b` (`A`,
+  `D`) ON DELETE CASCADE ON UPDATE CASCADE
+Trying to add in child table, in index PRIMARY tuple:
+ 0: len 4; hex 80000101; asc ....;; 1: len 4; hex 80000005; asc ....;; 2:
+ len 4; hex 6b68446b; asc khDk;; 3: len 6; hex 0000114e0edc; asc ...N..;; 4:
+ len 7; hex 00000000c3e0a7; asc .......;; 5: len 4; hex 6b68446b; asc khDk;;
+But in parent table test/ibtest11b, in index PRIMARY,
+the closest match we can find is record:
+RECORD: info bits 0 0: len 4; hex 8000015b; asc ...[;; 1: len 4; hex
+80000005; asc ....;; 2: len 3; hex 6b6864; asc khd;; 3: len 6; hex
+0000111ef3eb; asc ......;; 4: len 7; hex 800001001e0084; asc .......;; 5:
+len 3; hex 6b6864; asc khd;;
+------------------------
+LATEST DETECTED DEADLOCK
+------------------------
+030709 12:59:58
+*** (1) TRANSACTION:
+TRANSACTION 0 290252780, ACTIVE 1 sec, process no 3185, OS thread id 30733
+inserting
+LOCK WAIT 3 lock struct(s), heap size 320, undo log entries 146
+MySQL thread id 21, query id 4553379 localhost heikki update
+INSERT INTO alex1 VALUES(86, 86, 794,'aA35818','bb','c79166','d4766t',
+'e187358f','g84586','h794',date_format('2001-04-03 12:54:22','%Y-%m-%d
+%H:%i'),7
+*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
+RECORD LOCKS space id 0 page no 48310 n bits 568 table test/alex1 index
+symbole trx id 0 290252780 lock mode S waiting
+Record lock, heap no 324 RECORD: info bits 0 0: len 7; hex 61613335383138;
+asc aa35818;; 1:
+*** (2) TRANSACTION:
+TRANSACTION 0 290251546, ACTIVE 2 sec, process no 3190, OS thread id 32782
+inserting
+130 lock struct(s), heap size 11584, undo log entries 437
+MySQL thread id 23, query id 4554396 localhost heikki update
+REPLACE INTO alex1 VALUES(NULL, 32, NULL,'aa3572','','c3572','d6012t','',
+NULL,'h396', NULL, NULL, 7.31,7.31,7.31,200)
+*** (2) HOLDS THE LOCK(S):
+RECORD LOCKS space id 0 page no 48310 n bits 568 table test/alex1 index
+symbole trx id 0 290251546 lock_mode X locks rec but not gap
+Record lock, heap no 324 RECORD: info bits 0 0: len 7; hex 61613335383138;
+asc aa35818;; 1:
+*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
+RECORD LOCKS space id 0 page no 48310 n bits 568 table test/alex1 index
+symbole trx id 0 290251546 lock_mode X locks gap before rec insert intention
+waiting
+Record lock, heap no 82 RECORD: info bits 0 0: len 7; hex 61613335373230;
+asc aa35720;; 1:
+*** WE ROLL BACK TRANSACTION (1)
+------------
+TRANSACTIONS
+------------
+Trx id counter 0 290328385
+Purge done for trx's n:o &lt; 0 290315608 undo n:o &lt; 0 17
+Total number of lock structs in row lock hash table 70
+LIST OF TRANSACTIONS FOR EACH SESSION:
+---TRANSACTION 0 0, not started, process no 3491, OS thread id 42002
+MySQL thread id 32, query id 4668737 localhost heikki
+show innodb status
+---TRANSACTION 0 290328384, ACTIVE 0 sec, process no 3205, OS thread id
+38929 inserting
+1 lock struct(s), heap size 320
+MySQL thread id 29, query id 4668736 localhost heikki update
+insert into speedc values (1519229,1, 'hgjhjgghggjgjgjgjgjggjgjgjgjgjgggjgjg
+jlhhgghggggghhjhghgggggghjhghghghghghhhhghghghjhhjghjghjkghjghjghjghjfhjfh
+---TRANSACTION 0 290328383, ACTIVE 0 sec, process no 3180, OS thread id
+28684 committing
+1 lock struct(s), heap size 320, undo log entries 1
+MySQL thread id 19, query id 4668734 localhost heikki update
+insert into speedcm values (1603393,1, 'hgjhjgghggjgjgjgjgjggjgjgjgjgjgggjgj
+gjlhhgghggggghhjhghgggggghjhghghghghghhhhghghghjhhjghjghjkghjghjghjghjfhjf
+---TRANSACTION 0 290328327, ACTIVE 0 sec, process no 3200, OS thread id
+36880 starting index read
+LOCK WAIT 2 lock struct(s), heap size 320
+MySQL thread id 27, query id 4668644 localhost heikki Searching rows for
+update
+update ibtest11a set B = 'kHdkkkk' where A = 89572
+------- TRX HAS BEEN WAITING 0 SEC FOR THIS LOCK TO BE GRANTED:
+RECORD LOCKS space id 0 page no 65556 n bits 232 table test/ibtest11a index
+PRIMARY trx id 0 290328327 lock_mode X waiting
+Record lock, heap no 1 RECORD: info bits 0 0: len 9; hex 73757072656d756d00;
+asc supremum.;;
+------------------
+---TRANSACTION 0 290328284, ACTIVE 0 sec, process no 3195, OS thread id
+34831 rollback of SQL statement
+ROLLING BACK 14 lock struct(s), heap size 2496, undo log entries 9
+MySQL thread id 25, query id 4668733 localhost heikki update
+insert into ibtest11a (D, B, C) values (5, 'khDk' ,'khDk')
+---TRANSACTION 0 290327208, ACTIVE 1 sec, process no 3190, OS thread id
+32782
+58 lock struct(s), heap size 5504, undo log entries 159
+MySQL thread id 23, query id 4668732 localhost heikki update
+REPLACE INTO alex1 VALUES(86, 46, 538,'aa95666','bb','c95666','d9486t',
+'e200498f','g86814','h538',date_format('2001-04-03 12:54:22','%Y-%m-%d
+%H:%i'),
+---TRANSACTION 0 290323325, ACTIVE 3 sec, process no 3185, OS thread id
+30733 inserting
+4 lock struct(s), heap size 1024, undo log entries 165
+MySQL thread id 21, query id 4668735 localhost heikki update
+INSERT INTO alex1 VALUES(NULL, 49, NULL,'aa42837','','c56319','d1719t','',
+NULL,'h321', NULL, NULL, 7.31,7.31,7.31,200)
+--------
+FILE I/O
+--------
+I/O thread 0 state: waiting for i/o request (insert buffer thread)
+I/O thread 1 state: waiting for i/o request (log thread)
+I/O thread 2 state: waiting for i/o request (read thread)
+I/O thread 3 state: waiting for i/o request (write thread)
+Pending normal aio reads: 0, aio writes: 0,
+ ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
+Pending flushes (fsync) log: 0; buffer pool: 0
+151671 OS file reads, 94747 OS file writes, 8750 OS fsyncs
+25.44 reads/s, 18494 avg bytes/read, 17.55 writes/s, 2.33 fsyncs/s
+-------------------------------------
+INSERT BUFFER AND ADAPTIVE HASH INDEX
+-------------------------------------
+Ibuf for space 0: size 1, free list len 19, seg size 21,
+85004 inserts, 85004 merged recs, 26669 merges
+Hash table size 207619, used cells 14461, node heap has 16 buffer(s)
+1877.67 hash searches/s, 5121.10 non-hash searches/s
+---
+LOG
+---
+Log sequence number 18 1212842764
+Log flushed up to   18 1212665295
+Last checkpoint at  18 1135877290
+0 pending log writes, 0 pending chkp writes
+4341 log i/o's done, 1.22 log i/o's/second
+----------------------
+BUFFER POOL AND MEMORY
+----------------------
+Total memory allocated 84966343; in additional pool allocated 1402624
+Buffer pool size   3200
+Free buffers       110
+Database pages     3074
+Modified db pages  2674
+Pending reads 0
+Pending writes: LRU 0, flush list 0, single page 0
+Pages read 171380, created 51968, written 194688
+28.72 reads/s, 20.72 creates/s, 47.55 writes/s
+Buffer pool hit rate 999 / 1000
+--------------
+ROW OPERATIONS
+--------------
+0 queries inside InnoDB, 0 queries in queue
+Main thread process no. 3004, id 7176, state: purging
+Number of rows inserted 3738558, updated 127415, deleted 33707, read 755779
+1586.13 inserts/s, 50.89 updates/s, 28.44 deletes/s, 107.88 reads/s
+----------------------------
+END OF INNODB MONITOR OUTPUT
+============================
+</programlisting>
+
+      <para>
+        Some notes on the output:
+      </para>
+
+      <itemizedlist>
+
+        <listitem>
+          <para>
+            If the <literal>TRANSACTIONS</literal> section reports lock
+            waits, your applications may have lock contention. The
+            output can also help to trace the reasons for transaction
+            deadlocks.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            The <literal>SEMAPHORES</literal> section reports threads
+            waiting for a semaphore and statistics on how many times
+            threads have needed a spin or a wait on a mutex or a rw-lock
+            semaphore. A large number of threads waiting for semaphores
+            may be a result of disk I/O, or contention problems inside
+            <literal>InnoDB</literal>. Contention can be due to heavy
+            parallelism of queries or problems in operating system
+            thread scheduling. Setting
+            <literal>innodb_thread_concurrency</literal> smaller than
+            the default value can help in such situations.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            The <literal>BUFFER POOL AND MEMORY</literal> section gives
+            you statistics on pages read and written. You can calculate
+            from these numbers how many data file I/O operations your
+            queries currently are doing.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            The <literal>ROW OPERATIONS</literal> section shows what the
+            main thread is doing.
+          </para>
+        </listitem>
+
+      </itemizedlist>
+
+      <para>
+        Beginning with MySQL 4.0.19, <literal>InnoDB</literal> sends
+        diagnostic output to stderr or files instead of stdout or
+        fixed-size memory buffers, to avoid potential buffer overflow
+        errors. As a side effect, the output of <literal>SHOW INNODB
+        STATUS</literal> is written to a status file in the MySQL data
+        directory every fifteen seconds. The name of the file is
+       
<filename>innodb_status.<replaceable>pid</replaceable></filename>,
+        where <replaceable>pid</replaceable> is the server process ID.
+        <literal>InnoDB</literal> removes the file for a normal
+        shutdown. If abnormal shutdowns have occurred, instances of
+        these status files may be present and must be removed manually.
+        Before removing them, you might want to examine them to see
+        whether they contain useful information about the cause of
+        abnormal shutdowns. Beginning with MySQL 4.0.21, the
+       
<filename>innodb_status.<replaceable>pid</replaceable></filename>
+        file is created only if the configuration option
+        <literal>innodb_status_file=1</literal> is set.
+      </para>
+
+    </section>
+
+  </section>
+
+  <section id="innodb-multi-versioning">
+
+    <title>Implementation of Multi-Versioning</title>
+
+    <para>
+      Because <literal>InnoDB</literal> is a multi-versioned storage
+      engine, it must keep information about old versions of rows in the
+      tablespace. This information is stored in a data structure called
+      a <firstterm>rollback segment</firstterm> (after an analogous data
+      structure in Oracle).
+    </para>
+
+    <para>
+      Internally, <literal>InnoDB</literal> adds two fields to each row
+      stored in the database. A 6-byte field indicates the transaction
+      identifier for the last transaction that inserted or updated the
+      row. Also, a deletion is treated internally as an update where a
+      special bit in the row is set to mark it as deleted. Each row also
+      contains a 7-byte field called the roll pointer. The roll pointer
+      points to an undo log record written to the rollback segment. If
+      the row was updated, the undo log record contains the information
+      necessary to rebuild the content of the row before it was updated.
+    </para>
+
+    <para>
+      <literal>InnoDB</literal> uses the information in the rollback
+      segment to perform the undo operations needed in a transaction
+      rollback. It also uses the information to build earlier versions
+      of a row for a consistent read.
+    </para>
+
+    <para>
+      Undo logs in the rollback segment are divided into insert and
+      update undo logs. Insert undo logs are needed only in transaction
+      rollback and can be discarded as soon as the transaction commits.
+      Update undo logs are used also in consistent reads, but they can
+      be discarded only after there is no transaction present for which
+      <literal>InnoDB</literal> has assigned a snapshot that in a
+      consistent read could need the information in the update undo log
+      to build an earlier version of a database row.
+    </para>
+
+    <para>
+      You must remember to commit your transactions regularly, including
+      those transactions that issue only consistent reads. Otherwise,
+      <literal>InnoDB</literal> cannot discard data from the update undo
+      logs, and the rollback segment may grow too big, filling up your
+      tablespace.
+    </para>
+
+    <para>
+      The physical size of an undo log record in the rollback segment is
+      typically smaller than the corresponding inserted or updated row.
+      You can use this information to calculate the space need for your
+      rollback segment.
+    </para>
+
+    <para>
+      In the <literal>InnoDB</literal> multi-versioning scheme, a row is
+      not physically removed from the database immediately when you
+      delete it with an SQL statement. Only when
+      <literal>InnoDB</literal> can discard the update undo log record
+      written for the deletion can it also physically remove the
+      corresponding row and its index records from the database. This
+      removal operation is called a purge, and it is quite fast, usually
+      taking the same order of time as the SQL statement that did the
+      deletion.
+    </para>
+
+    <para>
+      In a scenario where the user inserts and deletes rows in smallish
+      batches at about the same rate in the table, it is possible that
+      the purge thread starts to lag behind, and the table grows bigger
+      and bigger, making everything disk-bound and very slow. Even if
+      the table would carry just 10MB of useful data, it may grow to
+      occupy 10GB with all the dead rows. In such a case, it would be
+      good to throttle new row operations, and allocate more resources
+      for the purge thread. Starting with MySQL 4.0.22 and 4.1.6, the
+      <literal>innodb_max_purge_lag</literal> system variable exists for
+      exactly this purpose. See <xref linkend="innodb-parameters"/>, for
+      more information.
+    </para>
+
+  </section>
+
+  <section id="innodb-table-and-index">
+
+    <title><literal>InnoDB</literal> Table and Index
Structures</title>
+
+    <para>
+      MySQL stores its data dictionary information for tables in
+      <filename>.frm</filename> files in database directories. This is
+      true for all MySQL storage engines. But every
+      <literal>InnoDB</literal> table also has its own entry in the
+      <literal>InnoDB</literal> internal data dictionary inside the
+      tablespace. When MySQL drops a table or a database, it has to
+      delete both an <filename>.frm</filename> file or files, and the
+      corresponding entries inside the <literal>InnoDB</literal> data
+      dictionary. This is the reason why you cannot move
+      <literal>InnoDB</literal> tables between databases simply by
+      moving the <filename>.frm</filename> files. It is also the reason
+      why <literal>DROP DATABASE</literal> did not work for
+      <literal>InnoDB</literal> type tables before MySQL 3.23.44.
+    </para>
+
+    <para>
+      Every <literal>InnoDB</literal> table has a special index called
+      the <firstterm>clustered index</firstterm> where the data for the
+      rows is stored. If you define a <literal>PRIMARY KEY</literal> on
+      your table, the index of the primary key is the clustered index.
+    </para>
+
+    <para>
+      If you do not define a <literal>PRIMARY KEY</literal> for your
+      table, MySQL picks the first <literal>UNIQUE</literal> index that
+      has only <literal>NOT NULL</literal> columns as the primary key
+      and <literal>InnoDB</literal> uses it as the clustered index. If
+      there is no such index in the table, <literal>InnoDB</literal>
+      internally generates a clustered index where the rows are ordered
+      by the row ID that <literal>InnoDB</literal> assigns to the rows
+      in such a table. The row ID is a 6-byte field that increases
+      monotonically as new rows are inserted. Thus, the rows ordered by
+      the row ID are physically in insertion order.
+    </para>
+
+    <para>
+      Accessing a row through the clustered index is fast because the
+      row data is on the same page where the index search leads. If a
+      table is large, the clustered index architecture often saves a
+      disk I/O when compared to the traditional solution. (In many
+      database systems, data storage uses a different page from the
+      index record.)
+    </para>
+
+    <para>
+      In <literal>InnoDB</literal>, the records in non-clustered indexes
+      (also called secondary indexes) contain the primary key value for
+      the row. <literal>InnoDB</literal> uses this primary key value to
+      search for the row from the clustered index. Note that if the
+      primary key is long, the secondary indexes use more space.
+    </para>
+
+    <para>
+      <literal>InnoDB</literal> compares <literal>CHAR</literal>
and
+      <literal>VARCHAR</literal> strings of different lengths such that
+      the remaining length in the shorter string is treated as if padded
+      with spaces.
+    </para>
+
+    <section id="innodb-physical-structure">
+
+      <title>Physical Structure of an Index</title>
+
+      <para>
+        All <literal>InnoDB</literal> indexes are B-trees where the
+        index records are stored in the leaf pages of the tree. The
+        default size of an index page is 16KB. When new records are
+        inserted, <literal>InnoDB</literal> tries to leave 1/16 of the
+        page free for future insertions and updates of the index
+        records.
+      </para>
+
+      <para>
+        If index records are inserted in a sequential order (ascending
+        or descending), the resulting index pages are about 15/16 full.
+        If records are inserted in a random order, the pages are from
+        1/2 to 15/16 full. If the fill factor of an index page drops
+        below 1/2, <literal>InnoDB</literal> tries to contract the index
+        tree to free the page.
+      </para>
+
+    </section>
+
+    <section id="innodb-insert-buffering">
+
+      <title>Insert Buffering</title>
+
+      <para>
+        It is a common situation in database applications that the
+        primary key is a unique identifier and new rows are inserted in
+        the ascending order of the primary key. Thus, the insertions to
+        the clustered index do not require random reads from a disk.
+      </para>
+
+      <para>
+        On the other hand, secondary indexes are usually non-unique, and
+        insertions into secondary indexes happen in a relatively random
+        order. This would cause a lot of random disk I/O operations
+        without a special mechanism used in <literal>InnoDB</literal>.
+      </para>
+
+      <para>
+        If an index record should be inserted to a non-unique secondary
+        index, <literal>InnoDB</literal> checks whether the secondary
+        index page is in the buffer pool. If that is the case,
+        <literal>InnoDB</literal> does the insertion directly to the
+        index page. If the index page is not found in the buffer pool,
+        <literal>InnoDB</literal> inserts the record to a special insert
+        buffer structure. The insert buffer is kept so small that it
+        fits entirely in the buffer pool, and insertions can be done
+        very fast.
+      </para>
+
+      <para>
+        Periodically, the insert buffer is merged into the secondary
+        index trees in the database. Often it is possible to merge
+        several insertions to the same page of the index tree, saving
+        disk I/O operations. It has been measured that the insert buffer
+        can speed up insertions into a table up to 15 times.
+      </para>
+
+      <para>
+        The insert buffer merging may continue to happen
+        <emphasis>after</emphasis> the inserting transaction has been
+        committed. In fact, it may continue to happen after a server
+        shutdown and restart (see <xref linkend="forcing-recovery"/>).
+      </para>
+
+      <para>
+        The insert buffer merging may take many hours, when many
+        secondary indexes must be updated, and many rows have been
+        inserted. During this time, disk I/O will be increased, which
+        can cause significant slowdown on disk-bound queries. Another
+        significant background I/O operation is the purge thread (see
+        <xref linkend="innodb-multi-versioning"/>).
+      </para>
+
+    </section>
+
+    <section id="innodb-adaptive-hash">
+
+      <title>Adaptive Hash Indexes</title>
+
+      <para>
+        If a table fits almost entirely in main memory, the fastest way
+        to perform queries on it is to use hash indexes.
+        <literal>InnoDB</literal> has a mechanism that monitors index
+        searches made to the indexes defined for a table. If
+        <literal>InnoDB</literal> notices that queries could benefit
+        from building a hash index, it does so automatically.
+      </para>
+
+      <para>
+        Note that the hash index is always built based on an existing
+        B-tree index on the table. <literal>InnoDB</literal> can build a
+        hash index on a prefix of any length of the key defined for the
+        B-tree, depending on the pattern of searches that
+        <literal>InnoDB</literal> observes for the B-tree index. A hash
+        index can be partial: It is not required that the whole B-tree
+        index is cached in the buffer pool. <literal>InnoDB</literal>
+        builds hash indexes on demand for those pages of the index that
+        are often accessed.
+      </para>
+
+      <para>
+        In a sense, <literal>InnoDB</literal> tailors itself through the
+        adaptive hash index mechanism to ample main memory, coming
+        closer to the architecture of main-memory databases.
+      </para>
+
+    </section>
+
+    <section id="innodb-physical-record">
+
+      <title>Physical Row Structure</title>
+
+      <para>
+        Records in <literal>InnoDB</literal> tables have the following
+        characteristics:
+      </para>
+
+      <itemizedlist>
+
+        <listitem>
+          <para>
+            Each index record contains a six-byte header. The header is
+            used to link together consecutive records, and also in
+            row-level locking.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Records in the clustered index contain fields for all
+            user-defined columns. In addition, there is a six-byte field
+            for the transaction ID and a seven-byte field for the roll
+            pointer.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            If no primary key was defined for a table, each clustered
+            index record also contains a six-byte row ID field.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Each secondary index record contains also all the fields
+            defined for the clustered index key.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            A record contains also a pointer to each field of the
+            record. If the total length of the fields in a record is
+            less than 128 bytes, the pointer is one byte; otherwise, two
+            bytes. The array of these pointers is called the record
+            directory. The area where these pointers point is called the
+            data part of the record.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Internally, <literal>InnoDB</literal> stores fixed-length
+            character columns such as <literal>CHAR(10)</literal> in a
+            fixed-length format. <literal>InnoDB</literal> truncates
+            trailing spaces from <literal>VARCHAR</literal> columns.
+            Note that MySQL may internally convert
+            <literal>CHAR</literal> columns to
+            <literal>VARCHAR</literal>. See
+            <xref linkend="silent-column-changes"/>.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            An SQL <literal>NULL</literal> value reserves 1 or 2 bytes
+            in the record directory. Besides that, an SQL
+            <literal>NULL</literal> value reserves zero bytes in the
+            data part of the record if stored in a variable length
+            column. In a fixed-length column, it reserves the fixed
+            length of the column in the data part of the record. The
+            motivation behind reserving the fixed space for
+            <literal>NULL</literal> values is that it enables an update
+            of the column from <literal>NULL</literal> to a
+            non-<literal>NULL</literal> value to be done in place
+            without causing fragmentation of the index page.
+          </para>
+        </listitem>
+
+      </itemizedlist>
+
+    </section>
+
+  </section>
+
+  <section id="file-space-management">
+
+    <title><literal>InnoDB</literal> File Space Management and Disk
I/O</title>
+
+    <section id="innodb-disk-io">
+
+      <title><literal>InnoDB</literal> Disk I/O</title>
+
+      <para>
+        <literal>InnoDB</literal> uses simulated asynchronous disk I/O:
+        <literal>InnoDB</literal> creates a number of threads to take
+        care of I/O operations, such as read-ahead.
+      </para>
+
+      <para>
+        There are two read-ahead heuristics in
+        <literal>InnoDB</literal>:
+      </para>
+
+      <itemizedlist>
+
+        <listitem>
+          <para>
+            In sequential read-ahead, if <literal>InnoDB</literal>
+            notices that the access pattern to a segment in the
+            tablespace is sequential, it posts in advance a batch of
+            reads of database pages to the I/O system.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            In random read-ahead, if <literal>InnoDB</literal> notices
+            that some area in a tablespace seems to be in the process of
+            being fully read into the buffer pool, it posts the
+            remaining reads to the I/O system.
+          </para>
+        </listitem>
+
+      </itemizedlist>
+
+      <para>
+        <literal>InnoDB</literal> uses a novel file flush technique
+        called <firstterm>doublewrite</firstterm>. It adds safety to
+        recovery following an operating system crash or a power outage,
+        and improves performance on most varieties of Unix by reducing
+        the need for <literal>fsync()</literal> operations.
+      </para>
+
+      <para>
+        Doublewrite means that before writing pages to a data file,
+        <literal>InnoDB</literal> first writes them to a contiguous
+        tablespace area called the doublewrite buffer. Only after the
+        write and the flush to the doublewrite buffer has completed does
+        <literal>InnoDB</literal> write the pages to their proper
+        positions in the data file. If the operating system crashes in
+        the middle of a page write, <literal>InnoDB</literal> can later
+        find a good copy of the page from the doublewrite buffer during
+        recovery.
+      </para>
+
+    </section>
+
+    <section id="innodb-file-space">
+
+      <title>File Space Management</title>
+
+      <para>
+        The data files that you define in the configuration file form
+        the tablespace of <literal>InnoDB</literal>. The files are
+        simply concatenated to form the tablespace. There is no striping
+        in use. Currently, you cannot define where within the tablespace
+        your tables are allocated. However, in a newly created
+        tablespace, <literal>InnoDB</literal> allocates space starting
+        from the first data file.
+      </para>
+
+      <para>
+        The tablespace consists of database pages with a default size of
+        16KB. The pages are grouped into extents of 64 consecutive
+        pages. The <quote>files</quote> inside a tablespace are called
+        <firstterm>segments</firstterm> in
<literal>InnoDB</literal>.
+        The term <quote>rollback segment</quote> is somewhat confusing
+        because it actually contains many tablespace segments.
+      </para>
+
+      <para>
+        Two segments are allocated for each index in
+        <literal>InnoDB</literal>. One is for non-leaf nodes of the
+        B-tree, the other is for the leaf nodes. The idea here is to
+        achieve better sequentiality for the leaf nodes, which contain
+        the data.
+      </para>
+
+      <para>
+        When a segment grows inside the tablespace,
+        <literal>InnoDB</literal> allocates the first 32 pages to it
+        individually. After that <literal>InnoDB</literal> starts to
+        allocate whole extents to the segment. <literal>InnoDB</literal>
+        can add to a large segment up to 4 extents at a time to ensure
+        good sequentiality of data.
+      </para>
+
+      <para>
+        Some pages in the tablespace contain bitmaps of other pages, and
+        therefore a few extents in an <literal>InnoDB</literal>
+        tablespace cannot be allocated to segments as a whole, but only
+        as individual pages.
+      </para>
+
+      <para>
+        When you ask for available free space in the tablespace by
+        issuing a <literal>SHOW TABLE STATUS</literal> statement,
+        <literal>InnoDB</literal> reports the extents that are
+        definitely free in the tablespace. <literal>InnoDB</literal>
+        always reserves some extents for cleanup and other internal
+        purposes; these reserved extents are not included in the free
+        space.
+      </para>
+
+      <para>
+        When you delete data from a table, <literal>InnoDB</literal>
+        contracts the corresponding B-tree indexes. Whether the freed
+        space becomes available for other users depends on whether the
+        pattern of deletes frees individual pages or extents to the
+        tablespace. Dropping a table or deleting all rows from it is
+        guaranteed to release the space to other users, but remember
+        that deleted rows are physically removed only in an (automatic)
+        purge operation after they are no longer needed for transaction
+        rollbacks or consistent reads. (See
+        <xref linkend="innodb-multi-versioning"/>.)
+      </para>
+
+    </section>
+
+    <section id="innodb-file-defragmenting">
+
+      <title>Defragmenting a Table</title>
+
+      <para>
+        If there are random insertions into or deletions from the
+        indexes of a table, the indexes may become fragmented.
+        Fragmentation means that the physical ordering of the index
+        pages on the disk is not close to the index ordering of the
+        records on the pages, or that there are many unused pages in the
+        64-page blocks that were allocated to the index.
+      </para>
+
+      <para>
+        A symptom of fragmentation is that a table takes more space than
+        it <quote>should</quote> take. How much that is exactly, is
+        difficult to determine. All <literal>InnoDB</literal> data and
+        indexes are stored in B-trees, and their fill factor may vary
+        from 50% to 100%. Another symptom of fragmentation is that a
+        table scan such as this takes more time than it
+        <quote>should</quote> take:
+      </para>
+
+<programlisting>
+SELECT COUNT(*) FROM t WHERE a_non_indexed_column &lt;&gt; 12345;
+</programlisting>
+
+      <para>
+        (In the preceding query, we are <quote>fooling</quote> the SQL
+        optimizer into scanning the clustered index, rather than a
+        secondary index.) Most disks can read 10 to 50MB/s, which can be
+        used to estimate how fast a table scan should run.
+      </para>
+
+      <para>
+        It can speed up index scans if you periodically perform a
+        <quote>null</quote> <literal>ALTER TABLE</literal>
operation:
+      </para>
+
+<programlisting>
+ALTER TABLE <replaceable>tbl_name</replaceable> TYPE=InnoDB;
+</programlisting>
+
+      <para>
+        That causes MySQL to rebuild the table. Another way to perform a
+        defragmentation operation is to use <command>mysqldump</command>
+        to dump the table to a text file, drop the table, and reload it
+        from the dump file.
+      </para>
+
+      <para>
+        If the insertions to an index are always ascending and records
+        are deleted only from the end, the <literal>InnoDB</literal>
+        filespace management algorithm guarantees that fragmentation in
+        the index does not occur.
+      </para>
+
+    </section>
+
+  </section>
+
+  <section id="innodb-error-handling">
+
+    <title><literal>InnoDB</literal> Error Handling</title>
+
+    <para>
+      Error handling in <literal>InnoDB</literal> is not always the same
+      as specified in the SQL standard. According to the standard, any
+      error during an SQL statement should cause the rollback of that
+      statement. <literal>InnoDB</literal> sometimes rolls back only
+      part of the statement, or the whole transaction. The following
+      items describe how <literal>InnoDB</literal> performs error
+      handling:
+    </para>
+
+    <itemizedlist>
+
+      <listitem>
+        <para>
+          If you run out of file space in the tablespace, a MySQL
+          <literal>Table is full</literal> error occurs and
+          <literal>InnoDB</literal> rolls back the SQL statement.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          A transaction deadlock or a timeout in a lock wait causes
+          <literal>InnoDB</literal> to roll back the whole transaction.
+        </para>
+
+        <para>
+          When a transaction rollback occurs due to a deadlock or lock
+          wait timeout, it cancels the effect of the statements within
+          the transaction. But if the start-transaction statement was
+          <literal>START TRANSACTION</literal> or
+          <literal>BEGIN</literal> statement, rollback does not cancel
+          that statement. Further SQL statements become part of the
+          transaction until the occurrence of <literal>COMMIT</literal>,
+          <literal>ROLLBACK</literal>, or some SQL statement that causes
+          an implicit commit.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          A duplicate-key error rolls back the SQL statement, if you
+          have not specified the <literal>IGNORE</literal> option in
+          your statement.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          A <literal>row too long error</literal> rolls back the SQL
+          statement.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Other errors are mostly detected by the MySQL layer of code
+          (above the <literal>InnoDB</literal> storage engine level),
+          and they roll back the corresponding SQL statement. Locks are
+          not released in a rollback of a single SQL statement.
+        </para>
+      </listitem>
+
+    </itemizedlist>
+
+    <para>
+      During such implicit rollbacks, as well as during the explicit
+      <literal>ROLLBACK</literal> SQL statement, <literal>SHOW
+      PROCESSLIST</literal> displays "Rolling back" in the
+      <literal>State</literal> column for the connection (starting from
+      MySQL 4.1.8).
+    </para>
+
+    <section id="innodb-error-codes">
+
+      <title><literal>InnoDB</literal> Error Codes</title>
+
+      <para>
+        The following is a non-exhaustive list of common
+        <literal>InnoDB</literal>-specific errors that you may
+        encounter, with information about why each occurs and how to
+        resolve the problem.
+      </para>
+
+      <itemizedlist>
+
+        <listitem>
+          <para>
+            <literal>1005 (ER_CANT_CREATE_TABLE)</literal>
+          </para>
+
+          <para>
+            Cannot create table. If the error message refers to
+            <literal>errno</literal> 150, table creation failed because
+            a foreign key constraint was not correctly formed. If the
+            error message refers to <literal>errno</literal> -1, table
+            creation probably failed because the table included a column
+            name that matched the name of an internal InnoDB table.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>1016 (ER_CANT_OPEN_FILE)</literal>
+          </para>
+
+          <para>
+            Cannot find the <literal>InnoDB</literal> table from the
+            <literal>InnoDB</literal> data files, although the
+            <filename>.frm</filename> file for the table exists. See
+            <xref linkend="innodb-troubleshooting-datadict"/>.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>1114 (ER_RECORD_FILE_FULL)</literal>
+          </para>
+
+          <para>
+            <literal>InnoDB</literal> has run out of free space in the
+            tablespace. You should reconfigure the tablespace to add a
+            new data file.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>1205 (ER_LOCK_WAIT_TIMEOUT)</literal>
+          </para>
+
+          <para>
+            Lock wait timeout expired. Transaction was rolled back.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>1213 (ER_LOCK_DEADLOCK)</literal>
+          </para>
+
+          <para>
+            Transaction deadlock. You should rerun the transaction.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>1216 (ER_NO_REFERENCED_ROW)</literal>
+          </para>
+
+          <para>
+            You are trying to add a row but there is no parent row, and
+            a foreign key constraint fails. You should add the parent
+            row first.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>1217 (ER_ROW_IS_REFERENCED)</literal>
+          </para>
+
+          <para>
+            You are trying to delete a parent row that has children, and
+            a foreign key constraint fails. You should delete the
+            children first.
+          </para>
+        </listitem>
+
+      </itemizedlist>
+
+    </section>
+
+    <section id="operating-system-error-codes">
+
+      <title>Operating System Error Codes</title>
+
+      <para>
+        To print the meaning of an operating system error number, use
+        the <command>perror</command> program that comes with the MySQL
+        distribution.
+      </para>
+
+      <remark role="todo">
+        CHECK URL
+      </remark>
+
+      <para>
+        The following table provides a list of some common Linux system
+        error codes. For a more complete list, see
+        <ulink
url="http://www.iglu.org.il/lxr/source/include/asm-i386/errno.h">Linux
+        source code</ulink>.
+      </para>
+
+      <itemizedlist>
+
+        <listitem>
+          <para>
+            <literal>1 (EPERM)</literal>
+          </para>
+
+          <para>
+            Operation not permitted
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>2 (ENOENT)</literal>
+          </para>
+
+          <para>
+            No such file or directory
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>3 (ESRCH)</literal>
+          </para>
+
+          <para>
+            No such process
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>4 (EINTR)</literal>
+          </para>
+
+          <para>
+            Interrupted system call
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>5 (EIO)</literal>
+          </para>
+
+          <para>
+            I/O error
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>6 (ENXIO)</literal>
+          </para>
+
+          <para>
+            No such device or address
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>7 (E2BIG)</literal>
+          </para>
+
+          <para>
+            Arg list too long
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>8 (ENOEXEC)</literal>
+          </para>
+
+          <para>
+            Exec format error
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>9 (EBADF)</literal>
+          </para>
+
+          <para>
+            Bad file number
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>10 (ECHILD)</literal>
+          </para>
+
+          <para>
+            No child processes
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>11 (EAGAIN)</literal>
+          </para>
+
+          <para>
+            Try again
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>12 (ENOMEM)</literal>
+          </para>
+
+          <para>
+            Out of memory
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>13 (EACCES)</literal>
+          </para>
+
+          <para>
+            Permission denied
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>14 (EFAULT)</literal>
+          </para>
+
+          <para>
+            Bad address
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>15 (ENOTBLK)</literal>
+          </para>
+
+          <para>
+            Block device required
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>16 (EBUSY)</literal>
+          </para>
+
+          <para>
+            Device or resource busy
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>17 (EEXIST)</literal>
+          </para>
+
+          <para>
+            File exists
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>18 (EXDEV)</literal>
+          </para>
+
+          <para>
+            Cross-device link
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>19 (ENODEV)</literal>
+          </para>
+
+          <para>
+            No such device
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>20 (ENOTDIR)</literal>
+          </para>
+
+          <para>
+            Not a directory
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>21 (EISDIR)</literal>
+          </para>
+
+          <para>
+            Is a directory
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>22 (EINVAL)</literal>
+          </para>
+
+          <para>
+            Invalid argument
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>23 (ENFILE)</literal>
+          </para>
+
+          <para>
+            File table overflow
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>24 (EMFILE)</literal>
+          </para>
+
+          <para>
+            Too many open files
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>25 (ENOTTY)</literal>
+          </para>
+
+          <para>
+            Inappropriate ioctl for device
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>26 (ETXTBSY)</literal>
+          </para>
+
+          <para>
+            Text file busy
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>27 (EFBIG)</literal>
+          </para>
+
+          <para>
+            File too large
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>28 (ENOSPC)</literal>
+          </para>
+
+          <para>
+            No space left on device
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>29 (ESPIPE)</literal>
+          </para>
+
+          <para>
+            Illegal seek
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>30 (EROFS)</literal>
+          </para>
+
+          <para>
+            Read-only file system
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>31 (EMLINK)</literal>
+          </para>
+
+          <para>
+            Too many links
+          </para>
+        </listitem>
+
+      </itemizedlist>
+
+      <para>
+        The following table provides a list of some common Windows
+        system error codes. For a complete list see the
+        <ulink
url="http://msdn.microsoft.com/library/default.asp?url=/library/en-us/debug/base/system_error_codes.asp">Microsoft
+        Web site</ulink>.
+      </para>
+
+      <itemizedlist>
+
+        <listitem>
+          <para>
+            <literal>1 (ERROR_INVALID_FUNCTION)</literal>
+          </para>
+
+          <para>
+            Incorrect function.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>2 (ERROR_FILE_NOT_FOUND)</literal>
+          </para>
+
+          <para>
+            The system cannot find the file specified.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>3 (ERROR_PATH_NOT_FOUND)</literal>
+          </para>
+
+          <para>
+            The system cannot find the path specified.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>4 (ERROR_TOO_MANY_OPEN_FILES)</literal>
+          </para>
+
+          <para>
+            The system cannot open the file.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>5 (ERROR_ACCESS_DENIED)</literal>
+          </para>
+
+          <para>
+            Access is denied.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>6 (ERROR_INVALID_HANDLE)</literal>
+          </para>
+
+          <para>
+            The handle is invalid.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>7 (ERROR_ARENA_TRASHED)</literal>
+          </para>
+
+          <para>
+            The storage control blocks were destroyed.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>8 (ERROR_NOT_ENOUGH_MEMORY)</literal>
+          </para>
+
+          <para>
+            Not enough storage is available to process this command.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>9 (ERROR_INVALID_BLOCK)</literal>
+          </para>
+
+          <para>
+            The storage control block address is invalid.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>10 (ERROR_BAD_ENVIRONMENT)</literal>
+          </para>
+
+          <para>
+            The environment is incorrect.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>11 (ERROR_BAD_FORMAT)</literal>
+          </para>
+
+          <para>
+            An attempt was made to load a program with an incorrect
+            format.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>12 (ERROR_INVALID_ACCESS)</literal>
+          </para>
+
+          <para>
+            The access code is invalid.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>13 (ERROR_INVALID_DATA)</literal>
+          </para>
+
+          <para>
+            The data is invalid.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>14 (ERROR_OUTOFMEMORY)</literal>
+          </para>
+
+          <para>
+            Not enough storage is available to complete this operation.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>15 (ERROR_INVALID_DRIVE)</literal>
+          </para>
+
+          <para>
+            The system cannot find the drive specified.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>16 (ERROR_CURRENT_DIRECTORY)</literal>
+          </para>
+
+          <para>
+            The directory cannot be removed.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>17 (ERROR_NOT_SAME_DEVICE)</literal>
+          </para>
+
+          <para>
+            The system cannot move the file to a different disk drive.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>18 (ERROR_NO_MORE_FILES)</literal>
+          </para>
+
+          <para>
+            There are no more files.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>19 (ERROR_WRITE_PROTECT)</literal>
+          </para>
+
+          <para>
+            The media is write protected.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>20 (ERROR_BAD_UNIT)</literal>
+          </para>
+
+          <para>
+            The system cannot find the device specified.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>21 (ERROR_NOT_READY)</literal>
+          </para>
+
+          <para>
+            The device is not ready.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>22 (ERROR_BAD_COMMAND)</literal>
+          </para>
+
+          <para>
+            The device does not recognize the command.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>23 (ERROR_CRC)</literal>
+          </para>
+
+          <para>
+            Data error (cyclic redundancy check).
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>24 (ERROR_BAD_LENGTH)</literal>
+          </para>
+
+          <para>
+            The program issued a command but the command length is
+            incorrect.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>25 (ERROR_SEEK)</literal>
+          </para>
+
+          <para>
+            The drive cannot locate a specific area or track on the
+            disk.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>26 (ERROR_NOT_DOS_DISK)</literal>
+          </para>
+
+          <para>
+            The specified disk or diskette cannot be accessed.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>27 (ERROR_SECTOR_NOT_FOUND)</literal>
+          </para>
+
+          <para>
+            The drive cannot find the sector requested.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>28 (ERROR_OUT_OF_PAPER)</literal>
+          </para>
+
+          <para>
+            The printer is out of paper.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>29 (ERROR_WRITE_FAULT)</literal>
+          </para>
+
+          <para>
+            The system cannot write to the specified device.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>30 (ERROR_READ_FAULT)</literal>
+          </para>
+
+          <para>
+            The system cannot read from the specified device.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>31 (ERROR_GEN_FAILURE)</literal>
+          </para>
+
+          <para>
+            A device attached to the system is not functioning.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>32 (ERROR_SHARING_VIOLATION)</literal>
+          </para>
+
+          <para>
+            The process cannot access the file because it is being used
+            by another process.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>33 (ERROR_LOCK_VIOLATION)</literal>
+          </para>
+
+          <para>
+            The process cannot access the file because another process
+            has locked a portion of the file.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>34 (ERROR_WRONG_DISK)</literal>
+          </para>
+
+          <para>
+            The wrong diskette is in the drive. Insert %2 (Volume Serial
+            Number: %3) into drive %1.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>36 (ERROR_SHARING_BUFFER_EXCEEDED)</literal>
+          </para>
+
+          <para>
+            Too many files opened for sharing.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>38 (ERROR_HANDLE_EOF)</literal>
+          </para>
+
+          <para>
+            Reached the end of the file.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>39 (ERROR_HANDLE_DISK_FULL)</literal>
+          </para>
+
+          <para>
+            The disk is full.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>87 (ERROR_INVALID_PARAMETER)</literal>
+          </para>
+
+          <para>
+            The parameter is incorrect. (If this error occurs on MySQL
+            4.1.9 on Windows and you have set
+            <literal>innodb_file_per_table</literal> in a server option
+            file, this is Bug #8021, and a workaround is to add the line
+            <literal>innodb_flush_method=unbuffered</literal> to the
+            file as well.)
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>112 (ERROR_DISK_FULL)</literal>
+          </para>
+
+          <para>
+            The disk is full.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>123 (ERROR_INVALID_NAME)</literal>
+          </para>
+
+          <para>
+            The filename, directory name, or volume label syntax is
+            incorrect.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>1450 (ERROR_NO_SYSTEM_RESOURCES)</literal>
+          </para>
+
+          <para>
+            Insufficient system resources exist to complete the
+            requested service.
+          </para>
+        </listitem>
+
+      </itemizedlist>
+
+    </section>
+
+  </section>
+
+  <section id="innodb-restrictions">
+
+    <title>Restrictions on <literal>InnoDB</literal>
Tables</title>
+
+    <itemizedlist>
+
+      <listitem>
+        <warning>
+          <para>
+            Do <emphasis>not</emphasis> convert MySQL system tables in
+            the <literal>mysql</literal> database from
+            <literal>MyISAM</literal> to
<literal>InnoDB</literal>
+            tables! This is an unsupported operation. If you do this,
+            MySQL does not restart until you restore the old system
+            tables from a backup or re-generate them with the
+            <command>mysql_install_db</command> script.
+          </para>
+        </warning>
+      </listitem>
+
+      <listitem>
+        <warning>
+          <para>
+            <indexterm>
+              <primary>NFS</primary>
+              <secondary>InnoDB</secondary>
+            </indexterm>
+
+            <indexterm>
+              <primary>InnoDB</primary>
+              <secondary>NFS</secondary>
+            </indexterm>
+
+            It is not a good idea to configure <literal>InnoDB</literal>
+            to use datafiles or logfiles on NFS volumes. Otherwise, the
+            files might be locked by other processes and become
+            unavailable for use by MySQL.
+          </para>
+        </warning>
+      </listitem>
+
+      <listitem>
+        <para>
+          A table cannot contain more than 1000 columns.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          The internal maximum key length is 3500 bytes, but MySQL
+          itself restricts this to 1024 bytes.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          The maximum row length, except for
+          <literal>VARBINARY</literal>,
<literal>VARCHAR</literal>,
+          <literal>BLOB</literal> and <literal>TEXT</literal>
columns,
+          is slightly less than half of a database page. That is, the
+          maximum row length is about 8000 bytes.
+          <literal>LONGBLOB</literal> and
<literal>LONGTEXT</literal>
+          columns must be less than 4GB, and the total row length,
+          including also <literal>BLOB</literal> and
+          <literal>TEXT</literal> columns, must be less than 4GB.
+          <literal>InnoDB</literal> stores the first 768 bytes of a
+          <literal>VARBINARY</literal>,
<literal>VARCHAR</literal>,
+          <literal>BLOB</literal>, or <literal>TEXT</literal>
column in
+          the row, and the rest into separate pages.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          On some older operating systems, files must be less than 2GB.
+          This is not a limitation of <literal>InnoDB</literal> itself,
+          but if you require a large tablespace, you will need to
+          configure it using several smaller data files rather than one
+          or a file large data files.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          The combined size of the <literal>InnoDB</literal> log files
+          must be less than 4GB.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          The minimum tablespace size is 10MB. The maximum tablespace
+          size is four billion database pages (64TB). This is also the
+          maximum size for a table.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          <literal>InnoDB</literal> tables do not support
+          <literal>FULLTEXT</literal> indexes.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          <literal>InnoDB</literal> tables do not support spatial data
+          types.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          <literal>ANALYZE TABLE</literal> determines index cardinality
+          (as displayed in the <literal>Cardinality</literal> column of
+          <literal>SHOW INDEX</literal> output) by doing ten random
+          dives to each of the index trees and updating index
+          cardinality estimates accordingly. Note that because these are
+          only estimates, repeated runs of <literal>ANALYZE
+          TABLE</literal> may produce different numbers. This makes
+          <literal>ANALYZE TABLE</literal> fast on
+          <literal>InnoDB</literal> tables but not 100% accurate as it
+          doesn't take all rows into account.
+        </para>
+
+        <para>
+          MySQL uses index cardinality estimates only in join
+          optimization. If some join is not optimized in the right way,
+          you can try using <literal>ANALYZE TABLE</literal>. In the few
+          cases that <literal>ANALYZE TABLE</literal> doesn't produce
+          values good enough for your particular tables, you can use
+          <literal>FORCE INDEX</literal> with your queries to force the
+          use of a particular index, or set the
+          <literal>max_seeks_for_key</literal> system variable to ensure
+          that MySQL prefers index lookups over table scans. See
+          <xref linkend="server-system-variables"/>, and
+          <xref linkend="optimizer-issues"/>.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          <literal>SHOW TABLE STATUS</literal> does not give accurate
+          statistics on <literal>InnoDB</literal> tables, except for the
+          physical size reserved by the table. The row count is only a
+          rough estimate used in SQL optimization.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          <literal>InnoDB</literal> does not keep an internal count of
+          rows in a table. (In practice, this would be somewhat
+          complicated due to multi-versioning.) To process a
+          <literal>SELECT COUNT(*) FROM t</literal> statement,
+          <literal>InnoDB</literal> must scan an index of the table,
+          which takes some time if the index is not entirely in the
+          buffer pool. To get a fast count, you have to use a counter
+          table you create yourself and let your application update it
+          according to the inserts and deletes it does. If your table
+          does not change often, using the MySQL query cache is a good
+          solution. <literal>SHOW TABLE STATUS</literal> also can be
+          used if an approximate row count is sufficient. See
+          <xref linkend="innodb-tuning"/>.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          On Windows, <literal>InnoDB</literal> always stores database
+          and table names internally in lowercase. To move databases in
+          binary format from Unix to Windows or from Windows to Unix,
+          you should always use explicitly lowercase names when creating
+          databases and tables.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          For an <literal>AUTO_INCREMENT</literal> column, you must
+          always define an index for the table, and that index must
+          contain just the <literal>AUTO_INCREMENT</literal> column. In
+          <literal>MyISAM</literal> tables, the
+          <literal>AUTO_INCREMENT</literal> column may be part of a
+          multi-column index.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Before MySQL 4.1.12, <literal>InnoDB</literal> does not
+          support the <literal>AUTO_INCREMENT</literal> table option for
+          setting the initial sequence value in an <literal>ALTER
+          TABLE</literal> statement. Before MySQL 4.1.14, the same is
+          true for <literal>CREATE TABLE</literal>. To set the value
+          with <literal>InnoDB</literal>, insert a dummy row with a
+          value one less and delete that dummy row, or insert the first
+          row with an explicit value specified.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          While initializing a previously specified
+          <literal>AUTO_INCREMENT</literal> column on a table,
+          <literal>InnoDB</literal> sets an exclusive lock on the end of
+          the index associated with the
+          <literal>AUTO_INCREMENT</literal> column. In accessing the
+          auto-increment counter, <literal>InnoDB</literal> uses a
+          specific table lock mode <literal>AUTO-INC</literal> where the
+          lock lasts only to the end of the current SQL statement, not
+          to the end of the entire transaction. Note that other clients
+          cannot insert into the table while the
+          <literal>AUTO-INC</literal> table lock is held; see
+          <xref linkend="innodb-and-autocommit"/>.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          When you restart the MySQL server, <literal>InnoDB</literal>
+          may reuse an old value that was generated for an
+          <literal>AUTO_INCREMENT</literal> column but never stored
+          (that is, a value that was generated during an old transaction
+          that was rolled back).
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          When an <literal>AUTO_INCREMENT</literal> column runs out of
+          values, <literal>InnoDB</literal> wraps a
+          <literal>BIGINT</literal> to
+          <literal>-9223372036854775808</literal> and <literal>BIGINT
+          UNSIGNED</literal> to <literal>1</literal>. However,
+          <literal>BIGINT</literal> values have 64 bits, so do note that
+          if you were to insert one million rows per second, it would
+          still take nearly three hundred thousand years before
+          <literal>BIGINT</literal> reached its upper bound. With all
+          other integer type columns, a duplicate-key error results.
+          This is similar to how <literal>MyISAM</literal> works,
+          because it is mostly general MySQL behavior and not about any
+          storage engine in particular.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          <literal>DELETE FROM
+          <replaceable>tbl_name</replaceable></literal> does not
+          regenerate the table but instead deletes all rows, one by one.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Under some conditions, <literal>TRUNCATE
+          <replaceable>tbl_name</replaceable></literal> for an
+          <literal>InnoDB</literal> table is mapped to <literal>DELETE
+          FROM+ <replaceable>tbl_name</replaceable></literal> and
+          doesn't reset the <literal>AUTO_INCREMENT</literal> counter.
+          See <xref linkend="truncate"/>.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Before MySQL 4.0.14 or 4.1.0, if you tried to create a unique
+          index on a prefix of a column you got an error:
+        </para>
+
+<programlisting>
+CREATE TABLE T (A CHAR(20), B INT, UNIQUE (A(5))) TYPE = InnoDB;
+</programlisting>
+
+        <para>
+          If you created a non-unique index on a prefix of a column,
+          <literal>InnoDB</literal> created an index over the whole
+          column. These restrictions were removed in MySQL 4.0.14.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Before MySQL 4.0.20 or 4.1.2, the MySQL <literal>LOCK
+          TABLES</literal> operation does not know about
+          <literal>InnoDB</literal> row-level locks set by completed SQL
+          statements. This means that you can get a table lock on a
+          table even if there still exist transactions by other users
+          who have row-level locks on the same table. Thus, your
+          operations on the table may have to wait if they collide with
+          these locks of other users. Also a deadlock is possible.
+          However, this does not endanger transaction integrity, because
+          the row-level locks set by <literal>InnoDB</literal> always
+          take care of the integrity. Also, a table lock prevents other
+          transactions from acquiring more row-level locks (in a
+          conflicting lock mode) on the table.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Beginning with MySQL 4.0.20 and 4.1.2, the MySQL <literal>LOCK
+          TABLES</literal> operation acquires two locks on each table if
+          <literal>innodb_table_locks=1</literal> (the default). In
+          addition to a table lock on the MySQL layer, it also acquires
+          an <literal>InnoDB</literal> table lock. Older versions of
+          MySQL do not acquire <literal>InnoDB</literal> table locks.
+          Beginning with MySQL 4.0.22 and 4.1.7, the old behavior can be
+          selected by setting <literal>innodb_table_locks=0</literal>.
+          If no <literal>InnoDB</literal> table lock is acquired,
+          <literal>LOCK TABLES</literal> completes even if some records
+          of the tables are being locked by other transactions.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          All <literal>InnoDB</literal> locks held by a transaction are
+          released when the transaction is committed or aborted. Thus,
+          it does not make much sense to invoke <literal>LOCK
+          TABLES</literal> on <literal>InnoDB</literal> tables in
+          <literal>AUTOCOMMIT=1</literal> mode, because the acquired
+          <literal>InnoDB</literal> table locks would be released
+          immediately.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Sometimes it would be useful to lock further tables in the
+          course of a transaction. Unfortunately, <literal>LOCK
+          TABLES</literal> in MySQL performs an implicit
+          <literal>COMMIT</literal> and <literal>UNLOCK
+          TABLES</literal>. An <literal>InnoDB</literal> variant of
+          <literal>LOCK TABLES</literal> has been planned that can be
+          executed in the middle of a transaction.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Before MySQL 3.23.52, replication always ran with autocommit
+          enabled. Therefore consistent reads in the slave would also
+          see partially processed transactions, and thus the read would
+          not be really consistent in the slave. This restriction was
+          removed in MySQL 3.23.52.
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          The <literal>LOAD TABLE FROM MASTER</literal> statement for
+          setting up replication slave servers does not work for
+          <literal>InnoDB</literal> tables. A workaround is to alter the
+          table to <literal>MyISAM</literal> on the master, then do the
+          load, and after that alter