I would advise to differentiate between "prepared statements" and
"cached/precompiled statements".
Statement caching is a performance optimization that all databases use.
If a client submits a query, e.g., SELECT * FROM foo WHERE bar=2, and
then another similar query, SELECT * FROM foo WHERE bar=5, then
prepared statement cache allows treating the parsed, optimized, and
compiled statement as a parametrized one and re-executing it with a
new bound parameter value.
Beside this, there is a "Prepared Statements" feature, which is 100%
defined by the ANSI SQL Standard. This is NOT subject to
interpretation. Take the ANSI SQL-92 Standard, read it, and comply
with what it says.
There is another "prepared statement" semantic meaning/use - HA transactions.
In terms of statement caching, only expensive to compile and optimize
statements are worthy of caching. Certain DDL statements may fall
into this category. For example, suppose every 10 seconds you create
a table named as "weblog" concatenated with timestamp: CREATE TABLE
%tname AS (SELECT ... FROM binlog, translog, clickdump, users WHERE
mytimestamp BETWEEN %ts AND (%ts + 10sec) AND............). You may
want to reuse the "SELECT" part for this, even though this is a DDL
statement.
While these three feature sets may partially share internal
implementations, they are distinctly and fundamentally different,
especially when you consider all the related processing that has to
happen, such as privileges/security aspects, related statements
(FETCH, ...), etc. In MySQL, the three aspects are intertwined
internally, making it difficult to evolve. Perhaps, this is why the
Drizzle community has decided to rewrite rather than adapt these
features.
About how other databases deal with this.
Oracle has a statement cache feature. It is described in detail in
the Concepts manual. I believe, Oracle statement cache implementation
is distinctly separate from prepared statement and prepared
transaction implementation.
Postgres uses statement naming. Cached statements are the statements
that either the user or the server decides to name. Implicit
statement caching in Postgres works on a per-session basis. Explicit
prepared statements are implemented identically to caching, EXCEPT a
different type of memory context is used for saving the statements.
Unnamed statements are also cached, but the next non-matching unnamed
statement overwrites the cached version.
In your messages, you are discussing "prepared statements" in terms of
statement cache.
Regards,
Alex Esterkin
On Fri, May 15, 2009 at 10:58 AM, Roy Lyseng <Roy.Lyseng@stripped> wrote:
> Kostja,
>
> this is at least the same approach as we took with Clustra. When preparing a
> DDL statement, we simply analyzed the first two or three words of the query
> to find out the query type, and then saved the query text in a buffer
> associated with the statement.
>
> At execute time, the text was parsed and processed regularly.
>
> The obvious drawback is that any syntax errors will only be reported at
> execute time. And the error reported would sometimes be different from an
> error reported at prepare time.
>
> From an execution cost perspective I think it makes a lot of sense because
> DDL statements are usually much less frequent than DML statements, because
> the heavy part of DDL statements is execution, and because repeated
> execution of DDL is very rare.
>
> However, with a Statement object with a prepare() and an execute() method,
> you might get separate preparation and execution of DDL statements at a low
> cost...
>
> Thanks,
> Roy
>
> Konstantin Osipov wrote:
>>
>> Hi,
>>
>> There is a patch that was done slightly more than 6 months ago
>> that allows one to use the binary protocol for any kind of SQL
>> command in the server.
>>
>> It was done to implement the new service interface for backup
>> (WL#4264).
>>
>> Internally, the server parses and executes the query each time
>> it receives it, but uses the binary protocol to interoperate.
>>
>> Today's prepared statements support all DML commands
>> (INSERT,UPDATE,DELETE,SELECT,LOAD DATA), plus a number of DDL
>> commands and administrative statements, but not all of it.
>> The rest of DDL is prohibited from use.
>>
>> Moreover, placeholders can be met (and have practical value) only
>> in DML commands. There are no placeholders in ALTER TABLE or DROP
>> TABLE, and they don't make a lot of sense there.
>>
>> A simplified implementation of prepared statements could accept
>> all SQL commands, but only use real prepare for DML. For DDL, it could use
>> the execution path of WL#4264, thus still use the
>> binary protocol.
>>
>> There doesn't seem to be a lot of performance gain in preparing
>> DDL statements, and placeholders can't be met there anyway.
>>
>> Taking this mixed would let MySQL support all SQL commands in the
>> binary protocol. From the client perspective, all statements will
>> be "preparable", use of different execution paths will happen
>> transparently. Internally that'll let us spare
>> the DDL language implementation from complexity of having to take care of
>> re-execution.
>>
>> I'm curious to know how other databases (Postgres/Ingres/Firebird)
>> handle preparation of DDL statements. Do they do it at all? If
>> not, maybe MySQL shouldn't try to do it either, and rather focus
>> on improving prepared performance of important DML
>> (INSERT/UPDATE/DELETE/SELECT and LOAD DATA/CALL/DO).
>>
>
> --
> MySQL Internals Mailing List
> For list archives: http://lists.mysql.com/internals
> To unsubscribe:
> http://lists.mysql.com/internals?unsub=1
>
>