Author: mhillyer
Date: 2005-11-03 01:01:48 +0100 (Thu, 03 Nov 2005)
New Revision: 227
Log:
Work in Progress push so Stefan can proofread.
Modified:
branches/MikePluggable/trunk/refman-5.1/custom-engine.xml
Modified: branches/MikePluggable/trunk/refman-5.1/custom-engine.xml
===================================================================
--- branches/MikePluggable/trunk/refman-5.1/custom-engine.xml 2005-11-02 21:49:48 UTC (rev 226)
+++ branches/MikePluggable/trunk/refman-5.1/custom-engine.xml 2005-11-03 00:01:48 UTC (rev 227)
@@ -45,8 +45,10 @@
<title>Overview</title>
- <para>The MySQL server is build in a modular fashion:</para>
-
+ <para>
+ The MySQL server is built in a modular fashion:
+ </para>
+
<figure>
<title>MySQL architecture</title>
<mediaobject>
@@ -58,22 +60,40 @@
</textobject>
</mediaobject>
</figure>
-
- <para>The storage engines manage data storage and index management
- for MySQL. The MySQL server communicates with the storage engines
- through a defined API.</para>
-
- <para>Each storage engine is an inherited class with each instance
- of the class being referred to as a <literal>handler</literal>.
+
+ <para>
+ The storage engines manage data storage and index management for
+ MySQL. The MySQL server communicates with the storage engines
+ through a defined API.
</para>
-
- <para>Handlers are instanced on the basis of one handler for each
- thread that needs to work with a specific table. For example: If
- three connections all start working with the same table, three
- handler instances will need to be created.</para>
-
-
+ <para>
+ Each storage engine is a class that inherits from a common base
+ class, with each instance of the class being referred to as a
+ <literal>handler</literal>.
+ </para>
+
+ <para>
+ Handlers are instanced on the basis of one handler for each thread
+ that needs to work with a specific table. For example, if three
+ connections all start working with the same table, three handler
+ instances will need to be created.
+ </para>
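+
+ <para>
+ The one-handler-per-thread-per-table rule can be pictured with the
+ following sketch. It is illustrative bookkeeping only and is not
+ how the server manages handlers internally;
+ <literal>MockHandler</literal>, <literal>open_handlers</literal>,
+ and <literal>handler_for()</literal> are hypothetical names.
+ </para>
+
```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a handler instance.
struct MockHandler { std::string table; };

// One entry per (thread id, table name) pair.
typedef std::pair<int, std::string> ThreadTable;
static std::map<ThreadTable, MockHandler> open_handlers;

// Return the handler this thread uses for the table, creating it on first use.
static MockHandler &handler_for(int thread_id, const std::string &table) {
  ThreadTable key(thread_id, table);
  std::map<ThreadTable, MockHandler>::iterator it = open_handlers.find(key);
  if (it == open_handlers.end()) {
    MockHandler h = { table };
    it = open_handlers.insert(std::make_pair(key, h)).first;
  }
  return it->second;
}
```
+
+ <para>
+ In this sketch, three threads asking for the same table end up
+ with three separate entries in <literal>open_handlers</literal>.
+ </para>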
+
+ <para>
+ Once a handler instance is created, the MySQL server issues
+ commands to the handler to perform data storage and retrieval
+ tasks such as opening a table, manipulating rows, and managing
+ indexes.
+ </para>
+
+ <para>
+ Custom storage engines can be built in a progressive manner:
+ developers can start with a read-only storage engine, then add
+ support for <literal>INSERT</literal>, <literal>UPDATE</literal>,
+ and <literal>DELETE</literal> operations, and later add support for
+ indexing, transactions, and other advanced operations.
+ </para>
+
</section>
<section id="custom-engine-create-files">
@@ -88,18 +108,472 @@
example engine files.
</remark>
- <para/>
+ <para>
+ The easiest way to implement a new storage engine is to begin by
+ copying and modifying the <literal>EXAMPLE</literal> storage
+ engine. The <filename>ha_example.cc</filename> and
+ <filename>ha_example.h</filename> files can be found in the
+ <filename>sql/examples/</filename> directory of the MySQL 5.1
+ source tree.
+ </para>
+ <para>
+ When copying the files, change the names from ha_example to
+ something appropriate to your storage engine such as
+ <filename>ha_foo.cc</filename> and <filename>ha_foo.h</filename>.
+ </para>
+
+ <para>
+ After you have copied and renamed the files you must replace all
+ instances of <literal>EXAMPLE</literal> and
+ <literal>example</literal> with the name of your storage engine.
+ If you are familiar with <literal>sed</literal>, these steps can
+ be done automatically:
+ </para>
+
+<programlisting>
+sed s/EXAMPLE/FOO/g ha_example.h | sed s/example/foo/g > ha_foo.h
+sed s/EXAMPLE/FOO/g ha_example.cc | sed s/example/foo/g > ha_foo.cc
+</programlisting>
+
+ <para>
+ For information on accessing the MySQL 5.1 BitKeeper tree, see
+ <xref linkend="installing-source"/>.
+ </para>
+
</section>
<section id="custom-engine-handlerton">
<title>Creating the handlerton</title>
- <para/>
+ <para>
+ The <literal>handlerton</literal> (short for 'handler singleton')
+ defines the storage engine and contains function pointers to those
+ functions that apply to the storage engine as a whole as opposed
+ to functions that work inside a single handler instance. Examples
+ of such functions include the transaction functions that handle
+ commits and rollbacks.
+ </para>
+ <para>
+ Here's an example from the <literal>EXAMPLE</literal> storage
+ engine:
+ </para>
+
+<programlisting>
+handlerton example_hton= {
+ "EXAMPLE",
+ SHOW_OPTION_YES,
+ "Example storage engine",
+ DB_TYPE_EXAMPLE_DB,
+ NULL, /* Initialize */
+ 0, /* slot */
+ 0, /* savepoint size. */
+ NULL, /* close_connection */
+ NULL, /* savepoint */
+ NULL, /* rollback to savepoint */
+ NULL, /* release savepoint */
+ NULL, /* commit */
+ NULL, /* rollback */
+ NULL, /* prepare */
+ NULL, /* recover */
+ NULL, /* commit_by_xid */
+ NULL, /* rollback_by_xid */
+ NULL, /* create_cursor_read_view */
+ NULL, /* set_cursor_read_view */
+ NULL, /* close_cursor_read_view */
+ example_create_handler, /* Create a new handler */
+ NULL, /* Drop a database */
+ NULL, /* Panic call */
+ NULL, /* Release temporary latches */
+ NULL, /* Update Statistics */
+ NULL, /* Start Consistent Snapshot */
+ NULL, /* Flush logs */
+ NULL, /* Show status */
+ NULL, /* Replication Report Sent Binlog */
+ HTON_CAN_RECREATE
+};
+</programlisting>
+
+ <para>
+ This is the definition of the handlerton from
+ <filename>handler.h</filename>:
+ </para>
+
+<programlisting>
+typedef struct
+ {
+ const char *name;
+ SHOW_COMP_OPTION state;
+ const char *comment;
+ enum db_type db_type;
+ bool (*init)();
+ uint slot;
+ uint savepoint_offset;
+ int (*close_connection)(THD *thd);
+ int (*savepoint_set)(THD *thd, void *sv);
+ int (*savepoint_rollback)(THD *thd, void *sv);
+ int (*savepoint_release)(THD *thd, void *sv);
+ int (*commit)(THD *thd, bool all);
+ int (*rollback)(THD *thd, bool all);
+ int (*prepare)(THD *thd, bool all);
+ int (*recover)(XID *xid_list, uint len);
+ int (*commit_by_xid)(XID *xid);
+ int (*rollback_by_xid)(XID *xid);
+ void *(*create_cursor_read_view)();
+ void (*set_cursor_read_view)(void *);
+ void (*close_cursor_read_view)(void *);
+ handler *(*create)(TABLE *table);
+ void (*drop_database)(char* path);
+ int (*panic)(enum ha_panic_function flag);
+ int (*release_temporary_latches)(THD *thd);
+ int (*update_statistics)();
+ int (*start_consistent_snapshot)(THD *thd);
+ bool (*flush_logs)();
+ bool (*show_status)(THD *thd, stat_print_fn *print, enum ha_stat_type stat);
+ int (*repl_report_sent_binlog)(THD *thd, char *log_file_name,
+ my_off_t end_offset);
+ uint32 flags;
+ } handlerton;
+</programlisting>
+
+ <para>
+ The first element in the handlerton is the name of the storage
+ engine. This is the name that will be used when creating tables
+ (<literal>CREATE TABLE ... ENGINE =
+ <replaceable>FOO</replaceable>;</literal>).
+ </para>
+
+ <para>
+ The second element in the handlerton determines whether the
+ storage engine will be listed when using the <literal>SHOW STORAGE
+ ENGINES</literal> command.
+ </para>
+
+ <para>
+ The third element in the handlerton is the storage engine comment,
+ a description of the storage engine displayed when using the
+ <literal>SHOW STORAGE ENGINES</literal> command.
+ </para>
+
+ <remark>
+ [MH] TODO: BETTER GET THE RIGHT INFORMATION FOR THIS NEXT
+ PARAGRAPH. WE NEED TO KNOW THE HIGHEST VALUE A CONSTANT CAN BE AND
+ THE FILE WHERE THE CONSTANTS ARE DEFINED.
+ </remark>
+
+ <para>
+ The fourth element in the handlerton is an integer that uniquely
+ identifies the storage engine within the MySQL server. The
+ constants used by the built-in storage engines are defined in the
+ <filename>handler.h</filename> file. As an alternative to creating
+ a constant you can use an integer that is greater than 25.
+ </para>
+
+ <para>
+ The remaining elements of the handlerton need to be populated only
+ if the functionality they refer to is supported by the storage
+ engine.
+ </para>
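+
+ <para>
+ As a rough illustration of optional elements, the sketch below
+ uses a cut-down, hypothetical <literal>MockHandlerton</literal>
+ with only two function pointers. The types and the
+ <literal>commit_if_supported()</literal> caller are assumptions
+ made for this example and are not taken from the MySQL source.
+ </para>
+
```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins; the real THD and handlerton live in the MySQL source.
struct MockThd { int commits; };

struct MockHandlerton {
  const char *name;
  int (*commit)(MockThd *thd, bool all);    // NULL when transactions are unsupported
  int (*rollback)(MockThd *thd, bool all);  // NULL when transactions are unsupported
};

// Count the commit so the effect is observable; always reports success.
static int foo_commit(MockThd *thd, bool) { ++thd->commits; return 0; }

// An engine that implements commit fills in that pointer.
static MockHandlerton foo_hton = { "FOO", foo_commit, NULL };
// An engine without transaction support leaves both pointers NULL.
static MockHandlerton bar_hton = { "BAR", NULL, NULL };

// Sketch of a caller: skip any engine that does not implement commit.
static int commit_if_supported(MockHandlerton *hton, MockThd *thd) {
  if (hton->commit == NULL)
    return 0;  // nothing to do for this engine
  return hton->commit(thd, true);
}
```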
+
+ <para>
+ The fifth element in the handlerton is a function pointer to the
+ storage engine initializer. This function is called only once,
+ when the server starts, to allow the storage engine class to perform
+ any housekeeping that is necessary before handlers are instanced.
+ </para>
+
+ <para>
+ The sixth element in the handlerton is the slot. Each storage
+ engine has its own memory area (actually a pointer) in the thd,
+ for storing per-connection information. It is accessed as
+ <literal>thd->ha_data[<replaceable>foo</replaceable>_hton.slot]</literal>.
+ The slot number is initialized by MySQL after
+ <literal><replaceable>foo</replaceable>_init()</literal> is
+ called.
+ </para>
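+
+ <para>
+ The slot lookup can be sketched as follows. This is a minimal
+ mock-up under stated assumptions: <literal>MockThd</literal>,
+ <literal>MAX_ENGINES</literal>, <literal>FooConnection</literal>,
+ and <literal>assign_slot()</literal> are hypothetical and only
+ mirror the <literal>ha_data[slot]</literal> access described above.
+ </para>
+
```cpp
#include <cassert>
#include <cstddef>

const int MAX_ENGINES = 8;  // hypothetical limit for this sketch

struct MockThd {
  void *ha_data[MAX_ENGINES];  // one pointer per storage engine
};

struct MockHandlerton {
  const char *name;
  unsigned slot;  // assigned by the server, not by the engine
};

// Hypothetical per-connection state for the FOO engine.
struct FooConnection { int open_tables; };

static MockHandlerton foo_hton = { "FOO", 0 };

// Server side: hand out the next free slot once the engine is initialized.
static unsigned next_slot = 0;
static void assign_slot(MockHandlerton *hton) { hton->slot = next_slot++; }

// Engine side: look up this connection's private data through the slot.
static FooConnection *foo_connection(MockThd *thd) {
  return static_cast<FooConnection *>(thd->ha_data[foo_hton.slot]);
}
```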
+
+ <para>
+ The seventh element in the handlerton is the savepoint offset. To
+ store per-savepoint data the storage engine is provided with an
+ area of a requested size (0 is acceptable here).
+ </para>
+
+ <para>
+ The savepoint offset must be initialized statically to the size of
+ the needed memory to store per-savepoint information. After
+ <literal><replaceable>foo</replaceable>_init</literal> it is
+ changed to be an offset to the savepoint storage area and need not
+ be used by the storage engine.
+ </para>
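+
+ <para>
+ One way to picture the change from size to offset is the sketch
+ below. It assumes the server lays out all per-savepoint areas in
+ one shared buffer; <literal>MockHandlerton</literal> and
+ <literal>assign_savepoint_offsets()</literal> are hypothetical
+ names used only for illustration.
+ </para>
+
```cpp
#include <cassert>
#include <cstddef>

struct MockHandlerton {
  const char *name;
  unsigned savepoint_offset;  // requested size before init, offset afterwards
};

// Server side: turn each engine's requested size into an offset within one
// shared per-savepoint buffer, and return the total buffer size needed.
static unsigned assign_savepoint_offsets(MockHandlerton **engines, size_t count) {
  unsigned total = 0;
  for (size_t i = 0; i < count; ++i) {
    unsigned size = engines[i]->savepoint_offset;  // statically requested size
    engines[i]->savepoint_offset = total;          // now an offset
    total += size;
  }
  return total;
}
```
+
+ <para>
+ With requested sizes of 16, 0, and 8 bytes, the three engines in
+ this sketch receive offsets 0, 16, and 16, and the shared buffer
+ is 24 bytes long.
+ </para>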
+
+ <para>
+ The eighth element in the handlerton is a function pointer to the
+ handler's close connection function. This is used by the NDB
+ storage engine to manage connections between the SQL nodes and the
+ storage nodes and should not be needed for most storage engines.
+ This function is only called if
+ <literal>thd->ha_data[<replaceable>foo</replaceable>_hton.slot]</literal>
+ is non-zero.
+ </para>
+
+ <para>
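+ The ninth element in the handlerton is a function pointer to the
+ handler's set savepoint function. Its <literal>sv</literal>
+ argument points to an uninitialized storage area of the requested
+ size (see the savepoint offset description).
+ </para>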
+
+ <para>
+ The tenth element in the handlerton is a function pointer to the
+ handler's rollback to savepoint function. This is used to return
+ to a savepoint during a transaction. This is only populated for
+ storage engines that support savepoints.
+ </para>
+
+ <para>
+ The eleventh element in the handlerton is a function pointer to
+ the handler's release savepoint function. This is used to release
+ the resources of a savepoint during a transaction. This is only
+ populated for storage engines that support savepoints.
+ </para>
+
+ <para>
+ The twelfth element in the handlerton is a function pointer to the
+ handler's commit function. This is used to commit a transaction.
+ This is only populated for storage engines that support
+ transactions.
+ </para>
+
+ <para>
+ The thirteenth element in the handlerton is a function pointer to
+ the handler's rollback function. This is used to roll back a
+ transaction. This is only populated for storage engines that
+ support transactions.
+ </para>
+
</section>
+ <section id="custom-engine-instancing">
+
+ <title>Handling Handler Instancing</title>
+
+ <para>
+ The first method call your storage engine needs to support is the
+ call for a new handler instance.
+ </para>
+
+ <para>
+ Before the handlerton is defined in the storage engine source
+ file, a function header for the instancing function must be
+ defined. Here is an example from the <literal>CSV</literal>
+ engine:
+ </para>
+
+<programlisting>
+static handler* tina_create_handler(TABLE *table);
+</programlisting>
+
+ <para>
+ As you can see, the function accepts a pointer to the table the
+ handler is intended to manage and returns a pointer to a handler
+ object.
+ </para>
+
+ <para>
+ After the function header is defined, the function is referenced
+ by the function pointer in the twenty-first handlerton element,
+ which identifies it as being responsible for generating new
+ handler instances.
+ </para>
+
+ <para>
+ Here is an example of the <literal>MyISAM</literal> storage
+ engine's instancing function:
+ </para>
+
+<programlisting>
+static handler *myisam_create_handler(TABLE *table)
+ {
+ return new ha_myisam(table);
+ }
+</programlisting>
+
+ <para>
+ This call then works in conjunction with the storage engine's
+ constructor. Here is an example from the
+ <literal>FEDERATED</literal> storage engine:
+ </para>
+
+<programlisting>
+ha_federated::ha_federated(TABLE *table_arg)
+ :handler(&federated_hton, table_arg),
+ mysql(0), stored_result(0), scan_flag(0),
+ ref_length(sizeof(MYSQL_ROW_OFFSET)), current_position(0)
+ {}
+</programlisting>
+
+ <para>
+ And from the <literal>EXAMPLE</literal> storage engine:
+ </para>
+
+<programlisting>
+ha_example::ha_example(TABLE *table_arg)
+ :handler(&example_hton, table_arg)
+ {}
+</programlisting>
+
+ <para>
+ The additional elements in the <literal>FEDERATED</literal>
+ example are extra initializations for the handler. The minimum
+ implementation required is the <literal>handler()</literal>
+ initialization shown in
+ the <literal>EXAMPLE</literal> version.
+ </para>
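+
+ <para>
+ Putting the pieces together, the sketch below shows the forward
+ declaration, the handlerton entry, the constructor, and the
+ instancing function in one place. It compiles without the MySQL
+ headers because <literal>MockTable</literal>,
+ <literal>MockHandler</literal>, and
+ <literal>MockHandlerton</literal> are hypothetical stand-ins for
+ the server's real types.
+ </para>
+
```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the server's TABLE, handlerton, and handler types.
struct MockTable { const char *name; };
struct MockHandlerton;

class MockHandler {
public:
  MockHandler(const MockHandlerton *hton_arg, MockTable *table_arg)
    : hton(hton_arg), table(table_arg) {}
  virtual ~MockHandler() {}
  const MockHandlerton *hton;
  MockTable *table;
};

struct MockHandlerton {
  const char *name;
  MockHandler *(*create)(MockTable *table);  // instancing function
};

// Forward declaration, so the handlerton can name the function defined below.
static MockHandler *foo_create_handler(MockTable *table);

static MockHandlerton foo_hton = { "FOO", foo_create_handler };

// Minimum constructor: pass the handlerton and table to the base class.
class ha_foo : public MockHandler {
public:
  explicit ha_foo(MockTable *table_arg) : MockHandler(&foo_hton, table_arg) {}
};

// Each call returns a new handler instance for the given table.
static MockHandler *foo_create_handler(MockTable *table) {
  return new ha_foo(table);
}
```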
+
+ </section>
+
+ <section id="custom-engine-extensions">
+
+ <title>Defining Table Extensions</title>
+
+ <para>
+ Storage engines are required to provide the MySQL server with a
+ list of extensions used by the storage engine with regard to a
+ given table, its data and indexes.
+ </para>
+
+ <para>
+ Extensions are expected in the form of a null-terminated string
+ array. The following is the array used by the
+ <literal>CSV</literal> engine:
+ </para>
+
+<programlisting>
+static const char *ha_tina_exts[] = {
+ ".CSV",
+ NullS
+};
+</programlisting>
+
+ <para>
+ This array is returned when the <link linkend="custom-engine-api-reference-bas_ext"><literal>bas_ext()</literal></link> function is
+ called:
+ </para>
+
+<programlisting>
+const char **ha_tina::bas_ext() const
+{
+ return ha_tina_exts;
+}
+</programlisting>
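+
+ <para>
+ To show why the terminating null entry matters, here is a sketch
+ of how a caller might walk such an array. The
+ <literal>.FOD</literal> and <literal>.FOI</literal> extensions and
+ the <literal>table_files()</literal> helper are hypothetical, and
+ plain <literal>NULL</literal> stands in for
+ <literal>NullS</literal>.
+ </para>
+
```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical extension list for a FOO engine; a null pointer ends the array.
static const char *ha_foo_exts[] = { ".FOD", ".FOI", NULL };

// Sketch of a caller: build the file name for each extension of a table.
static std::vector<std::string> table_files(const std::string &table_path,
                                            const char **exts) {
  std::vector<std::string> files;
  for (const char **ext = exts; *ext != NULL; ++ext)
    files.push_back(table_path + *ext);
  return files;
}
```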
+
+ </section>
+
+ <section id="custom-engine-create-table">
+ <title>Creating Tables</title>
+
+ <para>Once a handler is instanced, the first operation that will
+ likely be required is the creation of a table.</para>
+
+ <para>Your storage engine must implement the <link linkend="custom-engine-api-reference-create">create</link> virtual
+ function:</para>
+
+<programlisting>
+virtual int create(const char *name, TABLE *form, HA_CREATE_INFO *info)=0;
+</programlisting>
+
+ <para>This function should create all necessary files and then close
+ the table. The MySQL server will call for the table to be opened
+ later on.</para>
+
+ <para>The <literal>*name</literal> parameter is the name of the
+ table. The <literal>*form</literal> parameter is a
+ <literal>st_table</literal> structure that defines the table and
+ matches the contents of the
+ <filename><replaceable>tablename</replaceable>.frm</filename> file
+ already created by the MySQL server. It is not necessary for a
+ storage engine to modify the
+ <filename><replaceable>tablename</replaceable>.frm</filename> file
+ in most cases and there is no pre-built functionality to support
+ doing so.</para>
+
+ <para>The <literal>*info</literal> parameter is a structure
+ containing information on the <literal>CREATE TABLE</literal>
+ statement used to create the table. The structure is defined in
+ <filename>handler.h</filename> and copied here for your
+ convenience:</para>
+
+<programlisting>
+typedef struct st_ha_create_information
+ {
+ CHARSET_INFO *table_charset, *default_table_charset;
+ LEX_STRING connect_string;
+ const char *comment,*password;
+ const char *data_file_name, *index_file_name;
+ const char *alias;
+ ulonglong max_rows,min_rows;
+ ulonglong auto_increment_value;
+ ulong table_options;
+ ulong avg_row_length;
+ ulong raid_chunksize;
+ ulong used_fields;
+ SQL_LIST merge_list;
+ enum db_type db_type;
+ enum row_type row_type;
+ uint null_bits; /* NULL bits at start of record */
+ uint options; /* OR of HA_CREATE_ options */
+ uint raid_type,raid_chunks;
+ uint merge_insert_method;
+ uint extra_size; /* length of extra data segment */
+ bool table_existed; /* 1 in create if table existed */
+ bool frm_only; /* 1 if no ha_create_table() */
+ bool varchar; /* 1 if table has a VARCHAR */
+ } HA_CREATE_INFO;
+</programlisting>
+
+ <para>A basic storage engine can ignore the contents of
+ <literal>*form</literal> and <literal>*info</literal>, as all that
+ is really required is the creation and possibly the initialization
+ of the data files used by the storage engine (assuming the storage
+ engine is file-based).</para>
+
+ <para>For example, here is the implementation from the
+ <literal>CSV</literal> storage engine:</para>
+
+<programlisting>
+int ha_tina::create(const char *name, TABLE *table_arg,
+ HA_CREATE_INFO *create_info)
+ {
+ char name_buff[FN_REFLEN];
+ File create_file;
+ DBUG_ENTER("ha_tina::create");
+
+ if ((create_file= my_create(fn_format(name_buff, name, "", ".CSV",
+ MY_REPLACE_EXT|MY_UNPACK_FILENAME),0,
+ O_RDWR | O_TRUNC,MYF(MY_WME))) < 0)
+ DBUG_RETURN(-1);
+
+ my_close(create_file,MYF(0));
+
+ DBUG_RETURN(0);
+ }
+</programlisting>
+
+ <para>In the preceding example, the <literal>CSV</literal> engine
+ does not refer at all to the <literal>*table_arg</literal> or
+ <literal>*create_info</literal> parameters, but simply creates the
+ required data files, closes them, and returns.</para>
+
+ <para>The <literal>my_create</literal> and
+ <literal>my_close</literal> functions are helper functions defined
+ in <filename>src/include/my_sys.h</filename>.</para>
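+
+ <para>The same create-then-close pattern can be tried outside the
+ server with standard C++ file functions in place of
+ <literal>my_create</literal> and <literal>my_close</literal>. This
+ is only a sketch: <literal>foo_create()</literal> and the
+ <literal>.FOO</literal> extension are hypothetical, and the
+ function takes just the table name rather than the full
+ <literal>create()</literal> argument list.</para>
+
```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Sketch of create(): make an empty data file for the table, then close it.
// Returns 0 on success and -1 on failure, like the CSV example above.
static int foo_create(const char *name) {
  std::string path = std::string(name) + ".FOO";     // hypothetical extension
  std::FILE *file = std::fopen(path.c_str(), "wb");  // create or truncate
  if (file == NULL)
    return -1;
  std::fclose(file);  // the table is opened again later by a separate call
  return 0;
}
```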
+ </section>
+
+ <section id="custom-engine-open-table">
+ <title>Opening a Table</title>
+
+
+ </section>
+
+
<section id="custom-engine-table-scanning">
<title>Implementing Basic Table Scanning</title>