Author: paul
Date: 2005-12-17 01:07:46 +0100 (Sat, 17 Dec 2005)
New Revision: 578
Log:
r4856@frost: paul | 2005-12-16 18:06:43 -0600
Plugin interface.
Modified:
trunk/
trunk/refman-4.1/manual.xml
trunk/refman-5.1/extending-mysql.xml
trunk/refman-common/titles.en.ent
Property changes on: trunk
___________________________________________________________________
Name: svk:merge
- b5ec3a16-e900-0410-9ad2-d183a3acac99:/mysqldoc-local/mysqldoc/trunk:4807
bf112a9c-6c03-0410-a055-ad865cd57414:/mysqldoc-local/mysqldoc/trunk:1694
+ b5ec3a16-e900-0410-9ad2-d183a3acac99:/mysqldoc-local/mysqldoc/trunk:4856
bf112a9c-6c03-0410-a055-ad865cd57414:/mysqldoc-local/mysqldoc/trunk:1694
Modified: trunk/refman-4.1/manual.xml
===================================================================
--- trunk/refman-4.1/manual.xml 2005-12-16 21:02:03 UTC (rev 577)
+++ trunk/refman-4.1/manual.xml 2005-12-17 00:07:46 UTC (rev 578)
@@ -18,8 +18,8 @@
<abstract>
<para>
- This is the MySQL Reference Manual. It documents MySQL 3.23 to
- MySQL &current-version;.
+ This is the MySQL Reference Manual. It documents MySQL 3.23
+ through MySQL &current-version;.
</para>
<para>
Modified: trunk/refman-5.1/extending-mysql.xml
===================================================================
--- trunk/refman-5.1/extending-mysql.xml 2005-12-16 21:02:03 UTC (rev 577)
+++ trunk/refman-5.1/extending-mysql.xml 2005-12-17 00:07:46 UTC (rev 578)
@@ -488,26 +488,32 @@
<secondary>adding</secondary>
</indexterm>
- <remark role="todo">
- Also to mention: Describe the interface functions.
- </remark>
-
<remark role="help-category" condition="Plugins"/>
<para>
- MySQL 5.1 and up provides a plugin interface that can be used to
- add new functions to the server. This interface is intended to be
- the successor to the older user-defined function (UDF) interface.
- The plugin interface eventually will include an API for creating
- UDFs, and it is intended this plugin UDF API will replace the
- older non-plugin UDF API. After that point, it will be possible
- for UDFs to be revised for use as plugin UDFs so that they can
- take advantage of the better security and versioning capabilities
- of the plugin API. Eventually, support for the older UDF API will
- be phased out.
+ MySQL 5.1 and up supports a plugin API that allows the loading and
+ unloading of server components at runtime, without restarting the
+ server. Currently, the plugin API supports creation of full-text
+ parser plugins. Such a plugin can be used to replace or augment
+ the built-in full-text parser. For example, a plugin can parse
+ text into words using rules that differ from the rules used by the
+ built-in parser. This could be useful if you need to parse text
+ with different characteristics than those expected by the built-in
+ parser.
</para>
<para>
+ The plugin interface is intended as the successor to the older
+ user-defined function (UDF) interface. The plugin interface
+ eventually will include an API for creating UDFs, and it is
+ intended this plugin UDF API will replace the older non-plugin UDF
+ API. After that point, it will be possible for UDFs to be revised
+ for use as plugin UDFs so that they can take advantage of the
+ better security and versioning capabilities of the plugin API.
+ Eventually, support for the older UDF API will be phased out.
+ </para>
+
+ <para>
The plugin interface requires the <literal>plugin</literal> table
in the <literal>mysql</literal> database. This table is created as
part of the MySQL installation process. If you are upgrading from
@@ -516,19 +522,333 @@
this table. See <xref linkend="upgrading-grant-tables"/>.
</para>
- <para>
- A plugin contains code that becomes part of the running server, so
- when you write a plugin, you are bound by any and all constraints
- that otherwise apply to writing server code. For example, you may
- have problems if you attempt to use functions from the
- <literal>libstdc++</literal> library. Note that these constraints
- may change in future versions of the server, so it is possible
- that server upgrades will require revisions to plugins that were
- originally written for older servers. For information about these
- constraints, see <xref linkend="configure-options"/> and
- <xref linkend="compilation-problems"/>.
- </para>
+ <section id="plugin-api-characteristics">
+ <title>&title-plugin-api-characteristics;</title>
+
+ <para>
+ In some respects, the plugin API is similar to the older
+ user-defined function (UDF) API that it supersedes, but the
+ plugin API has several advantages over the older interface:
+ </para>
+
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ The plugin framework is extendable to accommodate different
+ kinds of plugins.
+ </para>
+
+ <para>
+ Some aspects of the plugin API are common to all types of
+ plugins, but the API also allows for type-specific interface
+ elements so that different types of plugins can be created.
+ A plugin with one purpose can have an interface most
+ appropriate to its own requirements and not the requirements
+ of some other plugin type.
+ </para>
+
+ <para>
+ Although only the interface for full-text parser plugins is
+ implemented currently, others can be added, such as an
+ interface for UDF plugins.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ The plugin API includes versioning information.
+ </para>
+
+ <para>
+ The version information included in the plugin API enables a
+ plugin library and each plugin that it contains to be
+ self-identifying with respect to the API version that was
+ used when the library was built. If the API changes over
+ time, the version numbers will change, but a server can
+ examine a given plugin library's version information to
+ determine whether it supports the plugins in the library.
+ </para>
+
+ <para>
+ There are two types of version numbers. The first is the
+ version for the general plugin framework itself. Each plugin
+ library includes this kind of version number. The second
+ type of version applies to individual plugins. Each specific
+ type of plugin has a version for its interface, so each
+ plugin in a library has a type-specific version number. For
+ example, a library containing a full-text parsing plugin has a
+ general plugin API version number, and the plugin has a
+ version number for the full-text plugin interface.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Plugin security is improved relative to the UDF interface.
+ </para>
+
+ <para>
+ The older interface for writing non-plugin UDFs allowed
+ libraries to be loaded from any directory searched by the
+ system's dynamic linker, and the symbols that identified the
+ UDF library were relatively non-specific. The newer rules
+ are more strict. A plugin library must be installed in a
+ specific dedicated directory for which the location is
+ controlled by the server and cannot be changed at runtime.
+ Also, the library must contain specific symbols that
+ identify it as a plugin library. The server will not load
+ something as a plugin if it was not built as a plugin.
+ </para>
+
+ <para>
+ The newer plugin interface eliminates the security issues of
+ the older UDF interface. When a UDF plugin type is
+ implemented, that will allow non-plugin UDFs to be brought
+ into the plugin framework and the older interface will be
+ phased out.
+ </para>
+ </listitem>
+
+ </itemizedlist>
+
+ <para>
+ The plugin implementation includes the following components:
+ </para>
+
+ <para>
+ Source files (the locations given indicate where the files are
+ found in a MySQL source distribution):
+ </para>
+
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ <filename>include/plugin.h</filename> exposes the public
+ plugin API. This file should be examined by anyone who wants
+ to write a plugin library.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <filename>sql/sql_plugin.h</filename> and
+ <filename>sql/sql_plugin.cc</filename> comprise the internal
+ plugin implementation. These files need not be consulted by
+ plugin writers. They may be of interest for those who want
+ to know more about how the server handles plugins.
+ </para>
+ </listitem>
+
+ </itemizedlist>
+
+ <para>
+ System table:
+ </para>
+
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ The <literal>plugin</literal> table in the
+ <literal>mysql</literal> database lists each installed
+ plugin and is required for plugin use. For new MySQL
+ installations, this table is created during the installation
+ process. If you are upgrading from a version older than
+ MySQL 5.1, you should run
+ <command>mysql_fix_privilege_tables</command> to update your
+ system tables and create the <literal>plugin</literal>
+ table.
+ </para>
+ </listitem>
+
+ </itemizedlist>
+
+ <para>
+ SQL statements:
+ </para>
+
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ <literal>INSTALL PLUGIN</literal> registers a plugin in the
+ <literal>plugin</literal> table and loads the plugin code.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>UNINSTALL PLUGIN</literal> unregisters a plugin
+ from the <literal>plugin</literal> table and unloads the
+ plugin code.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ The <literal>WITH PARSER</literal> clause for full-text
+ index creation associates a full-text parser plugin with a
+ given <literal>FULLTEXT</literal> index.
+ </para>
+ </listitem>
+
+ </itemizedlist>
+
+ <para>
+ System variable:
+ </para>
+
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ <literal>plugin_dir</literal> indicates the location of the
+ directory where all plugins (and UDFs) must be installed.
+ The value of this variable can be specified at server
+ startup with a
+ <option>--plugin_dir=<replaceable>path</replaceable></option>
+ option.
+ </para>
+ </listitem>
+
+ </itemizedlist>
+
+ </section>
+
+ <section id="plugin-full-text-plugins">
+
+ <title>&title-plugin-full-text-plugins;</title>
+
+ <para>
+ MySQL has a built-in parser that it uses by default for
+ full-text operations (parsing text to be indexed, or parsing a
+ query string to determine the terms to be used for a search).
+ For full-text processing, <quote>parsing</quote> means
+ extracting words from text or a query string based on rules that
+ define which character sequences make up a word and where word
+ boundaries lie.
+ </para>
+
+ <para>
+ When parsing for indexing purposes, the parser passes each word
+ to the server, which adds it to a full-text index. When parsing
+ a query string, the parser passes each word to the server, which
+ accumulates the words for use in a search.
+ </para>
+
+ <para>
+ The parsing properties of the built-in full-text parser are
+ described in <xref linkend="fulltext-search"/>. These properties
+ include rules for determining how to extract words from text.
+ The parser is influenced by certain system variables such as
+ <literal>ft_min_word_len</literal> and
+ <literal>ft_max_word_len</literal> that cause words shorter or
+ longer to be excluded, and by the stopword list that identifies
+ common words to be ignored.
+ </para>
+
+ <para>
+ The plugin API enables you to provide a full-text parser of your
+ own so that you have control over the basic duties of a parser.
+ A parser plugin can operate in either of two roles:
+ </para>
+
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ The plugin can replace the built-in parser. In this role,
+ the plugin reads the input to be parsed, splits it up into
+ words, and passes the words to the server (either for
+ indexing or for word accumulation).
+ </para>
+
+ <para>
+ One reason to use a parser this way is that you need to use
+ different rules from those of the built-in parser for
+ determining how to split up input into words. For example,
+ the built-in parser considers the text
+ <quote>case-sensitive</quote> to consist of two words
+ <quote>case</quote> and <quote>sensitive</quote>, whereas an
+ application might need to treat the text as a single word.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ The plugin can act in conjunction with the built-in parser
+ by serving as a front end for it. In this role, the plugin
+ extracts text from the input and passes the text to the
+ parser, which splits up the text into words using its normal
+ parsing rules.
+ </para>
+
+ <para>
+ One reason to use a parser this way is that you need to
+ index content such as PDF documents, XML documents, or
+ <filename>.doc</filename> files. The built-in parser is not
+ intended for those types of input but a plugin can pull out
+ the text from these input sources and pass it to the
+ built-in parser.
+ </para>
+ </listitem>
+
+ </itemizedlist>
+
+ <para>
+ It is also possible for a parser plugin to operate in both
+ roles. That is, it could extract text from non-plaintext input
+ (the front-end role), and also parse the text into words (thus
+ replacing the built-in parser).
+ </para>
+
+ <para>
+ A full-text plugin is associated with full-text indexes on a
+ per-index basis. That is, when you install a parser plugin
+ initially, that does not cause it to be used for any full-text
+ operations. It simply becomes available. For example, a
+ full-text parser plugin becomes available to be named in a
+ <literal>WITH PARSER</literal> clause when creating individual
+ <literal>FULLTEXT</literal> indexes. To create such an index at
+ table-creation time, do this:
+ </para>
+
+<programlisting>
+CREATE TABLE t
+(
+ doc CHAR(255),
+ FULLTEXT INDEX (doc) WITH PARSER my_parser
+);
+</programlisting>
+
+ <para>
+ Or you can add the index after the table has been created:
+ </para>
+
+<programlisting>
+ALTER TABLE t ADD FULLTEXT INDEX (doc) WITH PARSER my_parser;
+</programlisting>
+
+ <para>
+ The only SQL change for associating the parser with the index is
+ the <literal>WITH PARSER</literal> clause. Searches are
+ specified as before, with no changes needed for queries.
+ </para>
+
+ <para>
+ When you associate a parser plugin with a
+ <literal>FULLTEXT</literal> index, the plugin is required for
+ using the index. If the parser plugin is dropped, any index
+ associated with it becomes unusable. Any attempt to use a
+ table for which a plugin is not available results in an error,
+ although <literal>DROP TABLE</literal> is still possible.
+ </para>
+
+ </section>
+
<section id="install-plugin">
<title>&title-install-plugin;</title>
@@ -693,10 +1013,9 @@
<para>
<replaceable>plugin_name</replaceable> must be the name of some
- plugin that is listed in the
- <replaceable>mysql.plugin</replaceable> table. The server
- executes the plugin's termination (<literal>deinit()</literal>)
- function. It also removes the row for the plugin from the
+ plugin that is listed in the <literal>mysql.plugin</literal>
+ table. The server executes the plugin's deinitialization
+ function and removes the row for the plugin from the
<literal>mysql.plugin</literal> table, so that subsequent server
restarts will not load and initialize the plugin.
<literal>UNINSTALL PLUGIN</literal> does not remove the plugin's
@@ -715,23 +1034,1131 @@
Plugin removal has implications for the use of associated
tables. For example, if a full-text parser plugin is associated
with a <literal>FULLTEXT</literal> index on the table,
- uninstalling the plugin makes the table unusable. The table
- cannot be opened, so you cannot drop the index for which the
- plugin is used. This means that uninstalling a plugin is
- something to do with care unless you do not care about the table
- contents. If you are uninstalling a plugin with no intention of
- reinstalling it later (for example, to update it with a new
- version), and you care about the table contents, you should dump
- the table with <command>mysqldump</command> and remove the
- <literal>WITH PARSER</literal> clause from the dumped
- <literal>CREATE TABLE</literal> statement so that you can reload
- the table later. If you do not care about the table,
- <literal>DROP TABLE</literal> can be used even if plugins
- associated with the table are missing.
+ uninstalling the plugin makes the table unusable. Any attempt to
+ access the table results in an error. The table cannot even be
+ opened, so you cannot drop an index for which the plugin is
+ used. This means that uninstalling a plugin is something to do
+ with care unless you do not care about the table contents. If
+ you are uninstalling a plugin with no intention of reinstalling
+ it later (for example, to update it with a new version), and you
+ care about the table contents, you should dump the table with
+ <command>mysqldump</command> and remove the <literal>WITH
+ PARSER</literal> clause from the dumped <literal>CREATE
+ TABLE</literal> statement so that you can reload the table
+ later. If you do not care about the table, <literal>DROP
+ TABLE</literal> can be used even if plugins associated with the
+ table are missing.
</para>
</section>
+ <section id="plugin-writing">
+
+ <title>&title-plugin-writing;</title>
+
+ <para>
+ This section describes the general and type-specific parts of
+ the plugin API. It also provides a step-by-step guide to
+ creating a plugin library.
+ </para>
+
+ <para>
+ You can write plugins in C or C++. Plugins are loaded and
+ unloaded dynamically, so your operating system must support
+ dynamic loading and you must have compiled
+ <command>mysqld</command> dynamically (not statically).
+ </para>
+
+ <para>
+ A plugin contains code that becomes part of the running server,
+ so when you write a plugin, you are bound by any and all
+ constraints that otherwise apply to writing server code. For
+ example, you may have problems if you attempt to use functions
+ from the <literal>libstdc++</literal> library. Note that these
+ constraints may change in future versions of the server, so it
+ is possible that server upgrades will require revisions to
+ plugins that were originally written for older servers. For
+ information about these constraints, see
+ <xref linkend="configure-options"/> and
+ <xref linkend="compilation-problems"/>.
+ </para>
+
+ <section id="plugin-api-general">
+
+ <title>&title-plugin-api-general;</title>
+
+ <para>
+ Every plugin must have a general plugin declaration. The
+ declaration corresponds to the
+ <literal>st_mysql_plugin</literal> structure in the
+ <filename>plugin.h</filename> file:
+ </para>
+
+<programlisting>
+struct st_mysql_plugin
+{
+ int type; /* the plugin type (a MYSQL_XXX_PLUGIN value) */
+ void *info; /* pointer to type-specific plugin descriptor */
+ const char *name; /* plugin name */
+ const char *author; /* plugin author (for SHOW PLUGINS) */
+ const char *descr; /* general descriptive text (for SHOW PLUGINS) */
+ int (*init)(void); /* the function to invoke when plugin is loaded */
+ int (*deinit)(void); /* the function to invoke when plugin is unloaded */
+};
+</programlisting>
+
+ <para>
+ The <literal>st_mysql_plugin</literal> structure is common to
+ every type of plugin. Its members are used as follows:
+ </para>
+
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ <literal>type</literal>
+ </para>
+
+ <para>
+ The plugin type. This must be one of the plugin-type
+ values from <filename>plugin.h</filename>. For a full-text
+ parser plugin, the <literal>type</literal> value is
+ <literal>MYSQL_FTPARSER_PLUGIN</literal>.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>info</literal>
+ </para>
+
+ <para>
+ A pointer to the descriptor for the plugin. Unlike the
+ general plugin declaration structure, this descriptor's
+ structure depends on the particular type of plugin. Each
+ descriptor has a version number that indicates the API
+ version for that type of plugin, plus any other members
+ needed. The descriptor for full-text plugins is described
+ in <xref linkend="plugin-api-type-specific"/>.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>name</literal>
+ </para>
+
+ <para>
+ The plugin name. This is the name that will be listed in
+ the <literal>plugin</literal> table and by which you refer
+ to the plugin in SQL statements such as <literal>INSTALL
+ PLUGIN</literal> and <literal>UNINSTALL PLUGIN</literal>.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>author</literal>
+ </para>
+
+ <para>
+ The plugin author. This can be whatever you like.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>descr</literal>
+ </para>
+
+ <para>
+ A general description of the plugin. This can be whatever
+ you like.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>init</literal>
+ </para>
+
+ <para>
+ A once-only initialization function. This is executed when
+ the plugin is loaded, which happens for <literal>INSTALL
+ PLUGIN</literal> or, for plugins listed in the
+ <literal>plugin</literal> table, at server startup. The
+ function takes no arguments. It returns 0 for success and
+ 1 for failure.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>deinit</literal>
+ </para>
+
+ <para>
+ A once-only deinitialization function. This is executed
+ when the plugin is unloaded, which happens for
+ <literal>UNINSTALL PLUGIN</literal> or, for plugins listed
+ in the <literal>plugin</literal> table, at server
+ shutdown. The function takes no arguments. It returns 0
+ for success and 1 for failure.
+ </para>
+ </listitem>
+
+ </itemizedlist>
+
+ <para>
+ The <literal>init</literal> and <literal>deinit</literal>
+ functions in the general plugin declaration are invoked only
+ when loading and unloading the plugin. They have nothing to do
+ with use of the plugin such as happens when a SQL statement
+ causes the plugin to be invoked.
+ </para>
+
+ <para>
+ If an <literal>init</literal> or <literal>deinit</literal>
+ function is unneeded for a plugin, it can be specified as 0 in
+ the <literal>st_mysql_plugin</literal> structure.
+ </para>
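+
+ <para>
+ As an illustration of the members described above, the following
+ sketch fills in a general declaration for a hypothetical plugin
+ named <literal>simple_parser</literal>. It is not taken from the
+ MySQL sources. The structure is re-declared locally so that the
+ fragment compiles on its own (a real plugin would include
+ <filename>plugin.h</filename> instead), the numeric value given
+ to <literal>MYSQL_FTPARSER_PLUGIN</literal> is a placeholder, and
+ the descriptor object, the author and description strings, and
+ the function names are invented for the example.
+ </para>
+

```c
/* Placeholder value: the real constant is defined in plugin.h. */
#define MYSQL_FTPARSER_PLUGIN 0

/* Local copy of the structure shown above, so that this sketch
   compiles without the MySQL headers. */
struct st_mysql_plugin
{
  int type;             /* the plugin type */
  void *info;           /* pointer to type-specific plugin descriptor */
  const char *name;     /* plugin name */
  const char *author;   /* plugin author */
  const char *descr;    /* general descriptive text */
  int (*init)(void);    /* invoked when the plugin is loaded */
  int (*deinit)(void);  /* invoked when the plugin is unloaded */
};

/* Stand-in for the type-specific descriptor (hypothetical). */
static int simple_parser_descriptor_placeholder;

/* Set by the init function so that the effect of loading is visible. */
static int simple_parser_loaded = 0;

/* Once-only initialization: returns 0 for success, 1 for failure. */
static int simple_parser_plugin_init(void)
{
  simple_parser_loaded = 1;
  return 0;
}

/* The general plugin declaration, with members in structure order. */
static struct st_mysql_plugin simple_parser_plugin =
{
  MYSQL_FTPARSER_PLUGIN,                         /* type   */
  &simple_parser_descriptor_placeholder,         /* info   */
  "simple_parser",                               /* name   */
  "Example Author",                              /* author */
  "Simple full-text parser (illustration only)", /* descr  */
  simple_parser_plugin_init,                     /* init   */
  0                                              /* deinit: not needed */
};
```

+ <para>
+ Because this sketch has nothing to clean up, the
+ <literal>deinit</literal> member is given as 0.
+ </para>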
+
+ </section>
+
+ <section id="plugin-api-type-specific">
+
+ <title>&title-plugin-api-type-specific;</title>
+
+ <para>
+ In the <literal>st_mysql_plugin</literal> structure that
+ defines a plugin's general declaration, the
+ <literal>info</literal> member points to a type-specific
+ plugin descriptor. For a full-text parser plugin, the
+ descriptor corresponds to the
+ <literal>st_mysql_ftparser</literal> structure in the
+ <filename>plugin.h</filename> file:
+ </para>
+
+<programlisting>
+struct st_mysql_ftparser
+{
+ int interface_version;
+ int (*parse)(MYSQL_FTPARSER_PARAM *param);
+ int (*init)(MYSQL_FTPARSER_PARAM *param);
+ int (*deinit)(MYSQL_FTPARSER_PARAM *param);
+};
+</programlisting>
+
+ <para>
+ As shown by the structure definition, the descriptor has a
+ version number
+ (<literal>MYSQL_FTPARSER_INTERFACE_VERSION</literal> for
+ full-text parser plugins) and contains pointers to three
+ functions. Each of these members should point to a
+ function or be set to 0 if the function is not needed.
+ Normally, the <literal>parse</literal> member should be
+ non-zero or there is little reason for the plugin to exist.
+ </para>
+
+ <para>
+ A full-text parser plugin is used in two different contexts,
+ indexing and searching. In either context, the server calls
+ the initialization and deinitialization functions at the
+ beginning and end of processing each SQL statement that causes
+ the plugin to be invoked. During statement processing, the
+ main parsing function is called according to its use:
+ </para>
+
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ For indexing, the server calls the parser for each column
+ value to be indexed.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ For searching, the parser is called to parse the search
+ string. It might also be called for rows processed by the
+ statement. In natural language mode, there is no need for
+ the server to call the parser. For boolean mode phrase
+ searches or natural language searches with query
+ expansion, the parser is used to parse column values for
+ information that is not in the index. Also, if a boolean
+ mode search is done for a column that has no
+ <literal>FULLTEXT</literal> index, the built-in parser
+ will be called. (Plugins are associated with specific
+ indexes. If there is no index, no plugin is used.)
+ </para>
+ </listitem>
+
+ </itemizedlist>
+
+ <para>
+ Each interface function named in the plugin descriptor should
+ return 0 for success or 1 for failure, and each of them
+ receives an argument that points to a
+ <literal>MYSQL_FTPARSER_PARAM</literal> structure containing
+ the parsing context. The structure has this definition:
+ </para>
+
+<programlisting>
+typedef struct st_mysql_ftparser_param
+{
+ int (*mysql_parse)(void *param, byte *doc, uint doc_len);
+ int (*mysql_add_word)(void *param, byte *word, uint word_len,
+ MYSQL_FTPARSER_BOOLEAN_INFO *boolean_info);
+ void *ftparser_state;
+ void *mysql_ftparam;
+ CHARSET_INFO *cs;
+ byte *doc;
+ uint length;
+ int mode;
+} MYSQL_FTPARSER_PARAM;
+</programlisting>
+
+ <para>
+ The structure members are used as follows:
+ </para>
+
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ <literal>mysql_parse</literal>
+ </para>
+
+ <para>
+ A pointer to a callback function that invokes the server's
+ built-in parser. Use this callback when the plugin acts as
+ a front end to the built-in parser. When the plugin's
+ parsing function is called, it processes the input to
+ extract the text and passes the text to the
+ <literal>mysql_parse</literal> callback.
+ </para>
+
+ <para>
+ A front-end plugin can extract text and pass it all at
+ once to the built-in parser, or it can extract and pass
+ text to the built-in parser a piece at a time. However, in
+ this case, the built-in parser treats the pieces of text
+ as though there are implicit word breaks between them.
+ </para>
+
+ <para>
+ The first parameter for this callback function is the
+ <literal>mysql_ftparam</literal> member of the parsing
+ context structure. That is, if <literal>param</literal>
+ points to the structure, invoke the callback like this:
+ </para>
+
+<programlisting>
+param->mysql_parse(param->mysql_ftparam, ...);
+</programlisting>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>mysql_add_word</literal>
+ </para>
+
+ <para>
+ A pointer to a callback function that adds a word to a
+ full-text index or to the list of search terms. Use this
+ callback when the parser plugin acts as a replacement for
+ the built-in parser. When the plugin parsing function is
+ called, it parses the input into words and invokes the
+ <literal>mysql_add_word</literal> callback for each word.
+ </para>
+
+ <para>
+ The first parameter for this callback function is the
+ <literal>mysql_ftparam</literal> member of the parsing context structure.
+ That is, if <literal>param</literal> points to the
+ structure, invoke the callback like this:
+ </para>
+
+<programlisting>
+param->mysql_add_word(param->mysql_ftparam, ...);
+</programlisting>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ftparser_state</literal>
+ </para>
+
+ <para>
+ This is a generic pointer. The plugin can set it to point
+ to information to be used internally for its own purposes.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>mysql_ftparam</literal>
+ </para>
+
+ <para>
+ This is set by the server. It is passed as the first
+ argument to the <literal>mysql_parse</literal> or
+ <literal>mysql_add_word</literal> callback. The plugin
+ should not modify it.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>cs</literal>
+ </para>
+
+ <para>
+ A pointer to information about the character set of the
+ text, or 0 if no information is available.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>doc</literal>
+ </para>
+
+ <para>
+ A pointer to the text to be parsed.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>length</literal>
+ </para>
+
+ <para>
+ The length of the text to be parsed, in bytes.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>mode</literal>
+ </para>
+
+ <para>
+ The parsing mode. This value will be one of the following
+ constants:
+ </para>
+
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ <literal>MYSQL_FTPARSER_SIMPLE_MODE</literal>
+ </para>
+
+ <para>
+ Parse in fast and simple mode, which is used for
+ indexing and for natural language queries. The parser
+ should pass to the server only those words that should
+ be indexed. If the parser uses length limits or a
+ stopword list to determine which words to ignore, it
+ should not pass such words to the server.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>MYSQL_FTPARSER_WITH_STOPWORDS</literal>
+ </para>
+
+ <para>
+ Parse in stopword mode. This is used for boolean
+ searches for phrase matching. The parser should pass
+ all words to the server, even stopwords or words that
+ are outside any normal length limits.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>MYSQL_FTPARSER_FULL_BOOLEAN_INFO</literal>
+ </para>
+
+ <para>
+ Parse in boolean mode. This is used for parsing
+ boolean query strings. The parser should recognize not
+ only words but also boolean-mode operators and pass
+ them to the server as tokens via the
+ <literal>mysql_add_word</literal> callback. To tell
+ the server what kind of token is being passed, the
+ plugin needs to fill in a
+ <literal>MYSQL_FTPARSER_BOOLEAN_INFO</literal>
+ structure and pass a pointer to it.
+ </para>
+ </listitem>
+
+ </itemizedlist>
+ </listitem>
+
+ </itemizedlist>
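+
+ <para>
+ The front-end role can be sketched as follows. This is a
+ minimal illustration, not code from the MySQL sources. It uses a
+ trimmed-down local stand-in for
+ <literal>MYSQL_FTPARSER_PARAM</literal> that contains only the
+ members the example needs, so its layout does not match the real
+ structure. The function <literal>tag_stripper_parse</literal>,
+ the <literal>piece_log</literal> structure, and the
+ <literal>record_piece</literal> callback (which takes the place
+ of the server's built-in parser) are all hypothetical. The
+ function skips anything between <literal>&lt;</literal> and
+ <literal>&gt;</literal> and passes each remaining run of text to
+ the <literal>mysql_parse</literal> callback a piece at a time.
+ </para>
+

```c
typedef unsigned char byte;
typedef unsigned int uint;

/* Trimmed-down stand-in for MYSQL_FTPARSER_PARAM (illustration only):
   just the members this sketch uses. */
typedef struct st_mysql_ftparser_param
{
  int (*mysql_parse)(void *param, byte *doc, uint doc_len);
  void *mysql_ftparam;
  byte *doc;
  uint length;
} MYSQL_FTPARSER_PARAM;

/* Hypothetical front-end parsing function. Skips <...> markup and
   hands each run of plain text to the mysql_parse callback.
   Returns 0 for success, 1 for failure. */
static int tag_stripper_parse(MYSQL_FTPARSER_PARAM *param)
{
  byte *p = param->doc;
  byte *end = param->doc + param->length;

  while (p < end)
  {
    byte *start;

    if (*p == '<')
    {
      /* Skip the tag, including the closing '>' if there is one. */
      while (p < end && *p != '>')
        p++;
      if (p < end)
        p++;
      continue;
    }

    /* Collect text up to the next tag or the end of the input. */
    start = p;
    while (p < end && *p != '<')
      p++;

    /* The first argument is always param->mysql_ftparam. */
    if (param->mysql_parse(param->mysql_ftparam, start, (uint) (p - start)))
      return 1;
  }
  return 0;
}

/* Stand-in for the built-in parser, used only to try the sketch
   outside the server: it records how much text it was given. */
struct piece_log
{
  int pieces;
  uint bytes;
};

static int record_piece(void *param, byte *doc, uint doc_len)
{
  struct piece_log *seen = (struct piece_log *) param;

  (void) doc;
  seen->pieces++;
  seen->bytes += doc_len;
  return 0;
}
```

+ <para>
+ Because the text is passed in pieces, the built-in parser would
+ treat each tag boundary as an implicit word break, as noted in
+ the description of <literal>mysql_parse</literal>.
+ </para>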
+
+ <para>
+ If the parser is called in boolean mode, the
+ <literal>param->mode</literal> value will be
+ <literal>MYSQL_FTPARSER_FULL_BOOLEAN_INFO</literal>. The
+ <literal>MYSQL_FTPARSER_BOOLEAN_INFO</literal> structure that
+ the parser uses for passing token information to the server
+ looks like this:
+ </para>
+
+<programlisting>
+typedef struct st_mysql_ftparser_boolean_info
+{
+ enum enum_ft_token_type type;
+ int yesno;
+ int weight_adjust;
+ bool wasign;
+ bool trunc;
+ /* These are parser state and must be removed. */
+ byte prev;
+ byte *quot;
+} MYSQL_FTPARSER_BOOLEAN_INFO;
+</programlisting>
+
+ <para>
+ The parser should fill in the structure members as follows:
+ </para>
+
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ <literal>type</literal>
+ </para>
+
+ <para>
+ The token type. This should be one of the following
+ values:
+ </para>
+
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ <literal>FT_TOKEN_EOF</literal>: End of data
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>FT_TOKEN_WORD</literal>: A regular word
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>FT_TOKEN_LEFT_PAREN</literal>: The beginning
+ of a group or subexpression
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>FT_TOKEN_RIGHT_PAREN</literal>: The end of a
+ group or subexpression
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>FT_TOKEN_STOPWORD</literal>: A stopword
+ </para>
+ </listitem>
+
+ </itemizedlist>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>yesno</literal>
+ </para>
+
+ <para>
+ Whether the word must be present for a match to occur. 0
+ means that the word is optional but increases the match
+ relevance if it is present. Values larger than 0 mean that
+ the word must be present. Values smaller than 0 mean that
+ the word must not be present.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>weight_adjust</literal>
+ </para>
+
+ <para>
+ A weighting factor that determines how much a match for
+ the word counts. It can be used to increase or decrease
+ the word's importance in relevance calculations.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>wasign</literal>
+ </para>
+
+ <para>
+ The sign of the weighting factor.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>trunc</literal>
+ </para>
+
+ <para>
+ Whether matching should be done as if the
+ <literal>*</literal> truncation operator has been given.
+ </para>
+ </listitem>
+
+ </itemizedlist>
+
+ <para>
+ Plugins should not use the <literal>prev</literal> and
+ <literal>quot</literal> members of the
+ <literal>MYSQL_FTPARSER_BOOLEAN_INFO</literal> structure.
+ </para>
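+
+ <para>
+ The following sketch shows one way a parser might fill in the
+ structure for a word token. It is an illustration under stated
+ assumptions rather than code from the MySQL sources: the
+ enumeration and structure are re-declared locally (the numeric
+ values of the enumerators are placeholders), the helper
+ <literal>describe_word</literal> is hypothetical, and the mapping
+ of a leading <literal>+</literal> or <literal>-</literal> to the
+ <literal>yesno</literal> member assumes the usual boolean-mode
+ meaning of those operators.
+ </para>
+

```c
#include <stdbool.h>

typedef unsigned char byte;

/* Local re-declaration; the enumerator values are placeholders. */
enum enum_ft_token_type
{
  FT_TOKEN_EOF,
  FT_TOKEN_WORD,
  FT_TOKEN_LEFT_PAREN,
  FT_TOKEN_RIGHT_PAREN,
  FT_TOKEN_STOPWORD
};

/* Local copy of the structure shown above. */
typedef struct st_mysql_ftparser_boolean_info
{
  enum enum_ft_token_type type;
  int yesno;
  int weight_adjust;
  bool wasign;
  bool trunc;
  /* Parser state; plugins should not use these members. */
  byte prev;
  byte *quot;
} MYSQL_FTPARSER_BOOLEAN_INFO;

/* Hypothetical helper: describe a word token given the operator that
   preceded it ('+', '-', or 0 for none) and whether it was followed
   by the '*' truncation operator. */
static MYSQL_FTPARSER_BOOLEAN_INFO
describe_word(char leading_op, bool has_trailing_star)
{
  MYSQL_FTPARSER_BOOLEAN_INFO info =
    { FT_TOKEN_WORD, 0, 0, false, false, 0, 0 };

  if (leading_op == '+')
    info.yesno = 1;       /* word must be present */
  else if (leading_op == '-')
    info.yesno = -1;      /* word must not be present */
  /* Otherwise yesno stays 0: the word is optional. */

  info.trunc = has_trailing_star;
  return info;
}
```

+ <para>
+ A parser would pass a pointer to the resulting structure as the
+ last argument of the <literal>mysql_add_word</literal> callback.
+ The <literal>prev</literal> and <literal>quot</literal> members
+ are zeroed and otherwise left alone.
+ </para>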
+
+ </section>
+
+ <section id="plugin-creating">
+
+ <title>&title-plugin-creating;</title>
+
+ <remark role="todo">
+ Mention the header files.
+ </remark>
+
+ <para>
+ This section describes a step-by-step procedure for creating a
+ plugin library that contains a full-text parsing plugin named
+ <literal>simple_parser</literal>. This plugin performs parsing
+ based on simpler rules than those used by the MySQL built-in
+ full-text parser: Words are non-empty runs of non-whitespace
+ characters.
+ </para>
+
+ <para>
+ Each plugin library has the following contents:
+ </para>
+
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ A plugin library descriptor that indicates the version
+ number of the general plugin API that the library uses and
+ that contains a general declaration for each plugin in the
+ library.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Each plugin general declaration contains information that
+ is common to all types of plugin: A value that indicates
+ the plugin type; the plugin name, author, and description;
+ and pointers to the initialization and deinitialization
+ functions that the server invokes when it loads and
+ unloads the plugin.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ The plugin general declaration also contains a pointer to
+ a type-specific plugin descriptor. The structure of these
+ descriptors can vary from one plugin type to another,
+ because each type of plugin can have its own API. A plugin
+ descriptor contains a type-specific API version number and
+ pointers to the functions that are needed to implement
+ that plugin type. For example, a full-text parser plugin
+ has initialization and deinitialization functions, and a
+ main parsing function. The server invokes these functions
+ when it uses the plugin to parse text.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ The plugin library contains the interface functions that
+ are referenced by the library descriptor and by the plugin
+ descriptors.
+ </para>
+ </listitem>
+
+ </itemizedlist>
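+ <para>
+ The layering just described (a library descriptor holding
+ general declarations, each of which points to a
+ type-specific descriptor) can be pictured with a small
+ standalone C sketch. Every name here
+ (<literal>plugin_decl_sketch</literal>,
+ <literal>parser_descriptor_sketch</literal>,
+ <literal>count_plugins()</literal>) is hypothetical, the
+ member types are assumptions, and ending the walk at the
+ first entry with a null name is a choice made for this
+ illustration, not a statement about how the server scans
+ the array.
+ </para>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type-specific descriptor: an API version plus the
   functions that implement this plugin type. */
struct parser_descriptor_sketch
{
  int interface_version;
  int (*parse)(const char *doc, size_t length);
};

/* Hypothetical general declaration: information common to all plugin
   types, plus a pointer to the type-specific descriptor. */
struct plugin_decl_sketch
{
  int type;
  void *descriptor;
  const char *name;
  const char *author;
  const char *description;
  int (*init)(void);
  int (*deinit)(void);
};

/* Placeholder parsing function; it only reports the text length. */
static int sketch_parse(const char *doc, size_t length)
{
  (void) doc;
  return (int) length;
}

static struct parser_descriptor_sketch sketch_descriptor=
{
  1,            /* type-specific interface version (made up) */
  sketch_parse  /* parsing function */
};

/* One declaration per plugin, followed by an all-zero terminator. */
static struct plugin_decl_sketch sketch_declarations[]=
{
  { 1, &sketch_descriptor, "sketch_parser", "example author",
    "Illustration only", NULL, NULL },
  { 0, NULL, NULL, NULL, NULL, NULL, NULL }
};

/* Count declarations up to the terminator (assumed: name == NULL). */
static size_t count_plugins(const struct plugin_decl_sketch *decls)
{
  size_t n= 0;

  assert(decls != NULL);
  while (decls[n].name != NULL)
    n++;
  return n;
}
```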
+
+ <para>
+ To create a plugin library, follow these steps:
+ </para>
+
+ <orderedlist>
+
+ <listitem>
+ <para>
+ Set up the plugin library descriptor.
+ </para>
+
+ <para>
+ Every plugin library includes a library descriptor that
+ must define two symbols:
+ </para>
+
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ <literal>_mysql_plugin_interface_version_</literal>
+ specifies the version number of the general plugin
+ framework. This is given by the
+ <literal>MYSQL_PLUGIN_INTERFACE_VERSION</literal>
+ symbol, which is defined in the
+ <filename>plugin.h</filename> file.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>_mysql_plugin_declarations_</literal> defines
+ an array of plugin declarations, terminated by a
+ declaration with all members set to 0. Each
+ declaration is an instance of the
+ <literal>st_mysql_plugin</literal> structure (also
+ defined in <filename>plugin.h</filename>). There must
+ be one of these for each plugin in the library.
+ </para>
+ </listitem>
+
+ </itemizedlist>
+
+ <para>
+ If the server does not find these two symbols in a
+ library, it does not accept it as a legal plugin library
+ and rejects it with an error. This prevents use of a
+ library for plugin purposes unless it was built
+ specifically as a plugin library.
+ </para>
+
+ <para>
+ The standard (and most convenient) way to define the two
+ required symbols is by using the
+ <literal>mysql_declare_plugin</literal> and
+ <literal>mysql_declare_plugin_end</literal> macros from
+ the <filename>plugin.h</filename> file:
+ </para>
+
+<programlisting>
+mysql_declare_plugin
+ <replaceable>... one or more plugin declarations here ...</replaceable>
+mysql_declare_plugin_end;
+</programlisting>
+
+ <para>
+ For example, the library descriptor for a library that
+ contains a single plugin named
+ <literal>simple_parser</literal> looks like this:
+ </para>
+
+<programlisting>
+mysql_declare_plugin
+{
+ MYSQL_FTPARSER_PLUGIN, /* type */
+ &simple_parser_descriptor, /* descriptor */
+ "simple_parser", /* name */
+ "MySQL AB", /* author */
+ "Simple Full-Text Parser", /* description */
+ simple_parser_plugin_init, /* initialization function */
+ simple_parser_plugin_deinit /* deinitialization function */
+}
+mysql_declare_plugin_end;
+</programlisting>
+
+ <para>
+ For a full-text parser plugin, the type must be
+ <literal>MYSQL_FTPARSER_PLUGIN</literal>. This is the
+ value that identifies the plugin as being legal for use in
+ a <literal>WITH PARSER</literal> clause when creating a
+ <literal>FULLTEXT</literal> index. (No other plugin type
+ is legal for this clause.)
+ </para>
+
+ <para>
+ The <literal>mysql_declare_plugin</literal> and
+ <literal>mysql_declare_plugin_end</literal> macros are
+ defined in <filename>plugin.h</filename> like this:
+ </para>
+
+<programlisting>
+#define mysql_declare_plugin \
+int _mysql_plugin_interface_version_= MYSQL_PLUGIN_INTERFACE_VERSION; \
+struct st_mysql_plugin _mysql_plugin_declarations_[]= {
+#define mysql_declare_plugin_end ,{0,0,0,0,0,0,0}}
+</programlisting>
+
+ <para>
+ When the macros are used as just shown, they expand to the
+ following code, which defines both of the required symbols
+ (<literal>_mysql_plugin_interface_version_</literal> and
+ <literal>_mysql_plugin_declarations_</literal>):
+ </para>
+
+<programlisting>
+int _mysql_plugin_interface_version_= MYSQL_PLUGIN_INTERFACE_VERSION;
+struct st_mysql_plugin _mysql_plugin_declarations_[]= {
+{
+ MYSQL_FTPARSER_PLUGIN, /* type */
+ &simple_parser_descriptor, /* descriptor */
+ "simple_parser", /* name */
+ "MySQL AB", /* author */
+ "Simple Full-Text Parser", /* description */
+ simple_parser_plugin_init, /* initialization function */
+ simple_parser_plugin_deinit /* deinitialization function */
+}
+ ,{0,0,0,0,0,0,0}
+};
+</programlisting>
+ </listitem>
+
+ <listitem>
+ <para>
+ Set up the plugin descriptor.
+ </para>
+
+ <para>
+ Each plugin declaration in the library descriptor points
+ to a type-specific descriptor for the corresponding
+ plugin. In the <literal>simple_parser</literal>
+ declaration, that descriptor is indicated by
+ <literal>&simple_parser_descriptor</literal>. The
+ descriptor specifies the version number for the full-text
+ plugin interface (as given by
+ <literal>MYSQL_FTPARSER_INTERFACE_VERSION</literal>), and
+ the plugin's parsing, initialization, and deinitialization
+ functions:
+ </para>
+
+<programlisting>
+static struct st_mysql_ftparser simple_parser_descriptor=
+{
+ MYSQL_FTPARSER_INTERFACE_VERSION, /* interface version */
+ simple_parser_parse, /* parsing function */
+ simple_parser_init, /* parser init function */
+ simple_parser_deinit /* parser deinit function */
+};
+</programlisting>
+ </listitem>
+
+ <listitem>
+ <para>
+ Set up the functions for each plugin.
+ </para>
+
+ <para>
+ The general plugin declaration in the library descriptor
+ names the initialization and deinitialization functions
+ that the server should invoke when it loads and unloads
+ the plugin. For <literal>simple_parser</literal>, these
+ functions do nothing but return 0 to indicate that they
+ succeeded:
+ </para>
+
+<programlisting>
+static int simple_parser_plugin_init(void)
+{
+ return(0);
+}
+
+static int simple_parser_plugin_deinit(void)
+{
+ return(0);
+}
+</programlisting>
+
+ <para>
+ Because those functions do not actually do anything, you
+ could omit them and specify 0 for each of them in the
+ plugin declaration.
+ </para>
+
+ <para>
+ The type-specific plugin descriptor for
+ <literal>simple_parser</literal> names the initialization,
+ deinitialization, and parsing functions that the server
+ invokes when the plugin is used. The initialization and
+ deinitialization functions do nothing:
+ </para>
+
+<programlisting>
+static int simple_parser_init(MYSQL_FTPARSER_PARAM *param)
+{
+ return(0);
+}
+
+static int simple_parser_deinit(MYSQL_FTPARSER_PARAM *param)
+{
+ return(0);
+}
+</programlisting>
+
+ <para>
+ Here too, because those functions do nothing, you could
+ omit them and specify 0 for each of them in the plugin
+ descriptor.
+ </para>
+
+ <para>
+ The main parsing function,
+ <literal>simple_parser_parse()</literal>, acts as a
+ replacement for the built-in full-text parser, so it needs
+ to split text into words and pass each word to the server.
+ The parsing function's first argument is a pointer to a
+ structure that contains the parsing context. This
+ structure has a <literal>doc</literal> member that points
+ to the text to be parsed, and a <literal>length</literal>
+ member that indicates how long the text is. The simple
+ parsing done by the plugin considers non-empty runs of
+ non-whitespace characters to be words, so it
+ identifies words like this:
+ </para>
+
+<programlisting>
+static int simple_parser_parse(MYSQL_FTPARSER_PARAM *param)
+{
+ char *end, *start, *docend= param->doc + param->length;
+
+ for (end= start= param->doc;; end++)
+ {
+ if (end == docend)
+ {
+ if (end > start)
+ add_word(param, start, end - start);
+ break;
+ }
+ else if (isspace((unsigned char) *end))
+ {
+ if (end > start)
+ add_word(param, start, end - start);
+ start= end + 1;
+ }
+ }
+ return(0);
+}
+</programlisting>
+
+ <para>
+ As the parser finds each word, it invokes a function
+ <literal>add_word()</literal> to pass the word to the
+ server. <literal>add_word()</literal> is a helper function
+ only; it is not part of the plugin interface. The parser
+ passes the parsing context pointer to
+ <literal>add_word()</literal>, as well as a pointer to the
+ word and its length:
+ </para>
+
+<programlisting>
+static void add_word(MYSQL_FTPARSER_PARAM *param, char *word, size_t len)
+{
+ MYSQL_FTPARSER_BOOLEAN_INFO bool_info=
+ { FT_TOKEN_WORD, 0, 0, 0, 0, ' ', 0 };
+
+ if (param->mode & MYSQL_FTPARSER_FULL_BOOLEAN_INFO)
+ param->mysql_add_word(param->mysql_ftparam, word, len, &bool_info);
+ else
+ param->mysql_add_word(param->mysql_ftparam, word, len, 0);
+}
+</programlisting>
+ </listitem>
+
+ <listitem>
+ <para>
+ Compile the plugin library as a shared library and install
+ it in the plugin directory. The procedure for compiling
+ shared objects varies from system to system. If the library
+ is named <literal>mypluglib</literal>, you should end up
+ with a shared object file that has a name something like
+ <filename>libmypluglib.so</filename>. The filename might
+ have a different extension on your system.
+ </para>
+
+ <para>
+ The location of the plugin directory where you should
+ install the library is given by the
+ <literal>plugin_dir</literal> system variable. For
+ example:
+ </para>
+
+<programlisting>
+mysql> <userinput>SHOW VARIABLES LIKE 'plugin_dir';</userinput>
++---------------+----------------------------+
+| Variable_name | Value |
++---------------+----------------------------+
+| plugin_dir | /usr/local/mysql/lib/mysql |
++---------------+----------------------------+
+</programlisting>
+
+ <para>
+ When you install the plugin library, make sure that its
+ permissions are such that it is executable by the server.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Register the plugin with the server.
+ </para>
+
+ <para>
+ The <literal>INSTALL PLUGIN</literal> statement causes the
+ server to list the plugin in the <literal>plugin</literal>
+ table and to load the plugin code from the library file.
+ Use that statement to register
+ <literal>simple_parser</literal> with the server, and then
+ verify that the plugin is listed in the
+ <literal>plugin</literal> table:
+ </para>
+
+<programlisting>
+mysql> <userinput>INSTALL PLUGIN simple_parser SONAME 'libmypluglib.so';</userinput>
+Query OK, 0 rows affected (0.00 sec)
+
+mysql> <userinput>SELECT * FROM mysql.plugin;</userinput>
++---------------+-----------------+
+| name | dl |
++---------------+-----------------+
+| simple_parser | libmypluglib.so |
++---------------+-----------------+
+1 row in set (0.00 sec)
+</programlisting>
+ </listitem>
+
+ <listitem>
+ <para>
+ Try the plugin.
+ </para>
+
+ <para>
+ Create a table that contains a string column and associate
+ the parser plugin with a <literal>FULLTEXT</literal> index
+ on the column:
+ </para>
+
+<programlisting>
+<!--
+mysql> DROP TABLE IF EXISTS t;
+Query OK, 0 rows affected (0.01 sec)
+-->
+mysql> <userinput>CREATE TABLE t (c VARCHAR(255),</userinput>
+ -> <userinput> FULLTEXT (c) WITH PARSER simple_parser);</userinput>
+Query OK, 0 rows affected (0.01 sec)
+</programlisting>
+
+ <para>
+ Insert some text into the table and try some searches.
+ These should verify that the parser treats all
+ non-whitespace characters as word characters:
+ </para>
+
+<programlisting>
+mysql> <userinput>INSERT INTO t VALUES</userinput>
+ -> <userinput> ('latin1_general_cs is a case-sensitive collation'),</userinput>
+ -> <userinput> ('I\'d like a case of oranges'),</userinput>
+ -> <userinput> ('this is sensitive information'),</userinput>
+ -> <userinput> ('another row'),</userinput>
+ -> <userinput> ('yet another row');</userinput>
+Query OK, 5 rows affected (0.02 sec)
+Records: 5 Duplicates: 0 Warnings: 0
+
+mysql> <userinput>SELECT c FROM t;</userinput>
++-------------------------------------------------+
+| c |
++-------------------------------------------------+
+| latin1_general_cs is a case-sensitive collation |
+| I'd like a case of oranges |
+| this is sensitive information |
+| another row |
+| yet another row |
++-------------------------------------------------+
+5 rows in set (0.00 sec)
+
+mysql> <userinput>SELECT MATCH(c) AGAINST('case') FROM t;</userinput>
++--------------------------+
+| MATCH(c) AGAINST('case') |
++--------------------------+
+| 0 |
+| 1.2968142032623 |
+| 0 |
+| 0 |
+| 0 |
++--------------------------+
+5 rows in set (0.00 sec)
+
+mysql> <userinput>SELECT MATCH(c) AGAINST('sensitive') FROM t;</userinput>
++-------------------------------+
+| MATCH(c) AGAINST('sensitive') |
++-------------------------------+
+| 0 |
+| 0 |
+| 1.3253291845322 |
+| 0 |
+| 0 |
++-------------------------------+
+5 rows in set (0.01 sec)
+
+mysql> <userinput>SELECT MATCH(c) AGAINST('case-sensitive') FROM t;</userinput>
++------------------------------------+
+| MATCH(c) AGAINST('case-sensitive') |
++------------------------------------+
+| 1.3109166622162 |
+| 0 |
+| 0 |
+| 0 |
+| 0 |
++------------------------------------+
+5 rows in set (0.01 sec)
+
+mysql> <userinput>SELECT MATCH(c) AGAINST('I\'d') FROM t;</userinput>
++--------------------------+
+| MATCH(c) AGAINST('I\'d') |
++--------------------------+
+| 0 |
+| 1.2968142032623 |
+| 0 |
+| 0 |
+| 0 |
++--------------------------+
+5 rows in set (0.01 sec)
+</programlisting>
+ </listitem>
+
+ </orderedlist>
+
+ <para>
+ Note that neither <quote>case</quote> nor
+ <quote>sensitive</quote> matches
+ <quote>case-sensitive</quote> the way that it would for
+ the built-in parser.
+ </para>
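+ <para>
+ The search results can be reproduced outside the server
+ with a small standalone C sketch of the same
+ whitespace-only word splitting. The function
+ <literal>count_token_matches()</literal> is hypothetical
+ and is not part of the plugin. It compares bytes exactly,
+ which is an assumption made for simplicity: the server's
+ own comparison depends on the column's collation.
+ </para>

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Count the whitespace-delimited tokens in text that are exactly equal
   to word. Tokens are non-empty runs of non-whitespace characters. */
static size_t count_token_matches(const char *text, const char *word)
{
  size_t matches= 0;
  size_t wlen;
  const char *start, *end;

  assert(text != NULL && word != NULL);
  wlen= strlen(word);

  for (start= end= text;; end++)
  {
    if (*end == '\0' || isspace((unsigned char) *end))
    {
      /* A token ends here; empty tokens never match. */
      if (wlen > 0 && (size_t) (end - start) == wlen &&
          memcmp(start, word, wlen) == 0)
        matches++;
      if (*end == '\0')
        break;
      start= end + 1;
    }
  }
  return matches;
}
```

+ <para>
+ With this splitting, the first row of the example table
+ contains the single token <quote>case-sensitive</quote>,
+ so a search for <quote>case</quote> or
+ <quote>sensitive</quote> alone finds no token in that row.
+ </para>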
+
+ </section>
+
+ </section>
+
</section>
<section id="adding-functions">
@@ -937,7 +2364,8 @@
<remark role="help-topic" condition="CREATE FUNCTION"/>
<remark role="help-keywords">
- AGGREGATE CREATE FUNCTION STRING REAL INTEGER DECIMAL RETURNS SONAME
+ AGGREGATE CREATE FUNCTION STRING REAL INTEGER DECIMAL RETURNS
+ SONAME
</remark>
<remark role="help-syntax-begin"/>
Modified: trunk/refman-common/titles.en.ent
===================================================================
--- trunk/refman-common/titles.en.ent 2005-12-16 21:02:03 UTC (rev 577)
+++ trunk/refman-common/titles.en.ent 2005-12-17 00:07:46 UTC (rev 578)
@@ -1332,6 +1332,12 @@
<!ENTITY title-pluggable-storage-transactions "Storage Engines and Transactions">
<!ENTITY title-pluggable-storage-unplugging "Unplugging a Storage Engine">
<!ENTITY title-plugin-api "The MySQL Plugin Interface">
+<!ENTITY title-plugin-api-characteristics "Characteristics of the Plugin Interface">
+<!ENTITY title-plugin-full-text-plugins "Full-Text Parser Plugins">
+<!ENTITY title-plugin-api-general "General Plugin Structures and Functions">
+<!ENTITY title-plugin-api-type-specific "Type-Specific Plugin Structures and Functions">
+<!ENTITY title-plugin-creating "Creating a Plugin Library">
+<!ENTITY title-plugin-writing "Writing Plugins">
<!ENTITY title-point-in-time-recovery "Point-in-Time Recovery">
<!ENTITY title-point-in-time-recovery-positions "Specifying Positions for Recovery">
<!ENTITY title-point-in-time-recovery-times "Specifying Times for Recovery">