List:Internals« Previous MessageNext Message »
From:Michael Widenius Date:December 26 2002 4:21pm
Subject:Re: bk commit into 4.0 tree
View as plain text  
Hi!

Sorry for the very delayed response;  This email was in an email folder
that I don't check every day...

>>>>> "Tim" == Tim Bunce <Tim.Bunce@stripped> writes:

Tim> On Sun, Nov 10, 2002 at 09:39:02AM +0200, monty@stripped wrote:
>> +
>> +@itemize @bullet
>> +@item
>> +Warnings
>> +@item
>> +Prepared statements
>> +@item
>> +Binary protocol (will be much faster than the current protocol that
>> +converts everything to strings)

Tim> The word "much" probably overstates the effect. Faster yes, but "much" is
Tim> very relative and may raise expectations too high.

Sending/receiving data with the protocol will actually be much faster,
but of course the protocol itself is not everything that takes time.

I have now removed 'much' above.

>> +@multitable @columnfractions .10 .90
>> +@item Size @tab Comment
>> +@item 1	  @tab  0 ; Marker for ok package
>> +@item 1-9 @tab	Affected rows
>> +@item 1-9 @tab  Last insert id (0 if one wasn't used)
>> +@item 2   @tab  Server status; Can be used by client to check if we are inside
> an transaction
>> +@item 2	  @tab  Warning count

Tim> Capped at 65536? Could be higher after a load data command, for example.
Tim> Same applies to Warning count in end package. (BTW, most people would call
Tim> these 'packets' not 'packages'. Should at least clarify that near the top.)

Yes, capped at 65535.  You can get the exact error/warning count with:
SELECT @@error_count, @@warning_count;

The idea is to try to keep the number of bytes always sent between
server/client down to a minimum;  Later it's trivial to add an api
that does:

if (warning_count < 65535)
  return warning_count
else
  return mysql_select_result("SELECT @@error_count");

The server classifies things into 3 categories (NOTE, WARNING, ERROR)
and you will anyway want to ask the server for more information if
anything went very wrong.

Note also that, the server will itself in most cases cut the warning
after a given number of errors (default 64); (With a LOAD DATA problem
you are usually interested to find where the errors are to be able to
correct the data definition and do a reload than have to wade through
thousand of errors).

I have now updated the documentation to use packet instead of package.

>> +@item 1-9 @tab  Message length (optional)
>> +@item xxx @tab  Message (optional)
>> +@end multitable
>> +
>> +Size 1-9 means that the parameter is packed in to 1-9 bytes depending on
>> +the value.  (See function sql/net_pkg.cc::net_store_length).
>> +
>> +The message is optional.  For example for multi line INSERT it
>> +contains a string for how many rows was inserted / deleted.

Tim> Why send any text in the message at all for successful statements?
Tim> The "how many rows was inserted / deleted" for multi line INSERT,
Tim> for example, is already available in the Affected rows field.

Because affected rows doesn't tell everything (for example if you are
doing a REPLACE you may want to know how many rows are deleted).

It's only complex statements that may get a short info string, so I
don't think this is a notable overhead.

>> +@section 4.1 error package.
>> +
>> +The error package is sent when something goes wrong.
>> +The error package has the following structure:
>> +
>> +@item Size  @tab Comment
>> +@item 1	    @tab 255  Error package marker
>> +@item 1-255 @tab Null terminated error message

Tim> Is there a good reason for the error number not having a field?

Missed that in the documentation.  I have now added:

@item 2	    @tab Error code

>> +@section 4.1 prepared statement init package
>> +
>> +@item Size @tab Comment
>> +@item 4	   @tab Statement handler id
>> +@item 2	   @tab Number of columns in result set
>> +@item 2	   @tab Number of parameters in query
>> +
>> +After this, there is a packet that contains the following for each
>> +parameter in the query:
>> +
>> +@multitable @columnfractions .10 .90
>> +@item Size @tab Comment
>> +@item 2 @tab Enum value for field type. (MYSQL_TYPE_UNKNOWN if not known)
>> +@item 2 @tab 2 byte column flags (NOT_NULL_FLAG etc)
>> +@item 1 @tab Number of decimals
>> +@item 4 @tab Max column length.
>> +@end itemize
>> +
>> +Note that the above is not yet in 4.1 but will be added this month.
>> +
>> +As MySQL can have a parameter 'anywhere' it will in many cases not be
>> +able to provide the optimal information for all parameters.

Tim> What information is cached in the server with a prepared statement?

For the moment we are only doing a pre-parse and will reuse this
information for subsequent runs.

Tim> Can placeholders really be used *anywhere*? That implies that you're
Tim> not caching any meta-data associated with the prepared statement
Tim> (such as tables and grants) which implies that there little performance
Tim> benefit to be gained from using mysql prepared statements.

You can only have placeholders in places where you otherwise could
have a constant.  In other words, you can't have a placeholder for a
column or table name.

In the first version we will cache:

- Grants
- The parsing of the statement to meta data
- Initialisation and allocation of memory for all simple structures.

Later in 4.1 we will also do:

- Caching of pointer to fields/tables.
- Optimisation of WHERE expressions.
- Optimization rules (for things that are not parameters)

>> +@section 4.1 binary result set
>> +
>> +A binary result are sent the following way.
>> +
>> +For each result row:
>> +
>> +@itemize
>> +@item
>> +null bit map with first two bits set to 01 (bit 0,1 value 1)
>> +@item
>> +parameter data, repeated for each not null parameter.
>> +@end itemize

Tim> You refer to "parameter" here, do you mean "result field"?

Yes;  I have now updated this.

>> +The idea with the reserving two bits in the null map is that we can
>> +use standard error (first byte 255) and ok packages (first byte 0)
>> +to end a result sets.
>> +
>> +Except that the null-bit-map is shifted two steps, the server is
>> +sending the data to the client the same way that the server is sending
>> +bound parameters to the client.  The server is always sending the data
>> +as type given for 'column type' for respective column.  It's up to the
>> +client to convert the parameter to the requested type.

Tim> The protocol needs to allow room for expansion. Things like:

Tim> 	UPDATE table SET foo=foo+1 WHERE x=y RETURNING foo, bar

Tim> (returns a result set) and stored procedures which can return values
Tim> and/or update their parameters.

Returning a result set should be trivial, as this is what the protocol
already does.

We have not thought about letting the server update parameters.

Can you give me an example where we would need this ?

Thanks for all your comments!

Regards,
Monty
Thread
bk commit into 4.0 treemonty10 Nov
  • Re: bk commit into 4.0 treeTim Bunce14 Nov
    • Re: bk commit into 4.0 treeMichael Widenius26 Dec
      • Re: bk commit into 4.0 treeTim Bunce30 Dec