Hi Chuck,
I'm reading the code for packing/unpacking table column information in
log_event.{h,cc} and rpl_utility.{h,cc} and this is how I understand it:
1. Information is stored inside Table_map event in a fixed byte-order as defined
by Table_map_log_event::save_field_metadata().
2. In slave, this data is copied to m_field_metadata member of Table_map_event
as it is. Then it is parsed in the constructor of table_def class which fills
table_def::m_field_metadata member.
3. The table_def::m_field_metadata member is declared as an array of uint16 with
the intention that a single entry can contain either a (decoded) length of a
field or, for some field types, length together with other info in the high and
low byte (e.g. size and precision for decimals).
If I were writing this code, I would make the design in point 3 more explicit by
declaring m_field_metadata as an array of unions, like this:
union field_type_info {
uint16 length;
struct
{
uchar decimals;
uchar precision;
} new_decimal;
struct
{
uchar type;
uchar length;
} set_or_enum;
...
};
Then the code in table_def::calc_field_size could look like this:
uint32 table_def::calc_field_size(uint col, uchar *master_data) const
{
uint32 length;
field_type_info *info= &m_field_metadata[col];
switch (type(col)) {
case MYSQL_TYPE_NEWDECIMAL:
length= my_decimal_get_binary_size(info->new_decimal.decimals,
info->new_decimal.precision);
break;
case MYSQL_TYPE_DECIMAL:
case MYSQL_TYPE_FLOAT:
case MYSQL_TYPE_DOUBLE:
length= info->length;
break;
case MYSQL_TYPE_SET:
DBUG_ASSERT(info->set_or_enum.type == MYSQL_TYPE_SET);
length= info->set_or_enum.length;
break;
<... and so on>
Even if you don't want to change the code as I suggested, please fix the
following places in the current one.
In table_def constructor
------------------------
> table_def(field_type *types, ulong size, uchar *field_metadata,
> int metadata_size, uchar *null_bitmap)
> : m_size(size), m_type(0),
> m_field_metadata(0), m_null_bits(0), m_memory(NULL)
> {
> m_memory= (uchar *)my_multi_malloc(MYF(MY_WME),
> &m_type, size,
> &m_field_metadata,
> size * sizeof(uint16),
> &m_null_bits, (size + 7) / 8,
> NULL);
>
> bzero(m_field_metadata, size * sizeof(uint16));
>
> if (m_type)
> memcpy(m_type, types, size);
> else
> m_size= 0;
> /*
> Extract the data from the table map into the field metadata array
> iff there is field metadata. The variable metadata_size will be
> 0 if we are replicating from an older version server since no field
> metadata was written to the table map. This can also happen if
> there were no fields in the master that needed extra metadata.
> */
> if (m_size && metadata_size)
> {
> int index= 0;
> for (unsigned int i= 0; i < m_size; i++)
> {
> switch (m_type[i]) {
> case MYSQL_TYPE_TINY_BLOB:
> case MYSQL_TYPE_BLOB:
> case MYSQL_TYPE_MEDIUM_BLOB:
> case MYSQL_TYPE_LONG_BLOB:
> case MYSQL_TYPE_DOUBLE:
> case MYSQL_TYPE_FLOAT:
> {
> /*
> These types store a single byte.
> */
> m_field_metadata[i]= (uchar)field_metadata[index];
Cast not needed, field_metadata[index] *is* of type uchar. If any, the cast
should be to uint16 (but this is implicit).
> index++;
> break;
> }
> case MYSQL_TYPE_SET:
> case MYSQL_TYPE_ENUM:
> case MYSQL_TYPE_STRING:
> {
> short int x= field_metadata[index++] << 8U; // real_type
> x = x + field_metadata[index++]; // pack or field length
x should be declared unsigned since it is used as a two 8-bit values here. Why
short? - it only makes it look more dangerous. Best declare it as uint16, i.e.
the target type.
I would write "x = x | (0xFF & field_metadata[index++])" to make it more
explicit what is going on here.
> m_field_metadata[i]= x;
(
If my suggestion was implemented, this code would look as:
m_field_metadata[i].set_or_enum.type = field_metadata[index++];
m_field_metadata[i].set_or_enum.length = field_metadata[index++];
)
> break;
> }
> case MYSQL_TYPE_BIT:
> {
> short int x= field_metadata[index++];
> x = x + (field_metadata[index++] << 8U);
> m_field_metadata[i]= x;
As above.
> break;
> }
> case MYSQL_TYPE_VARCHAR:
> {
> /*
> These types store two bytes.
> */
> char *ptr= (char *)&field_metadata[index];
> m_field_metadata[i]= sint2korr(ptr);
I would use uint2korr here and add asserts in
Table_map_log_event::save_field_metadata() checking that a negative value is
never saved in the event.
If, for some reason, we allow negative values to be stored here, the code should
look as follows:
char *ptr= (char *)&field_metadata[index];
int16 x= sint2korr(ptr);
DBUG_ASSERT(x >= 0);
m_field_metadata[i]= x;
This is because m_field_metadata[i] is unsigned and you should never store a
negative value in it.
> index= index + 2;
> break;
> }
> case MYSQL_TYPE_NEWDECIMAL:
> {
> short int x= field_metadata[index++] << 8U; // precision
> x = x + field_metadata[index++]; // decimals
> m_field_metadata[i]= x;
As above, declare x to be uint16.
> break;
> }
> default:
> m_field_metadata[i]= 0;
> break;
> }
> }
> }
In calc_field_size()
--------------------
> uint32 table_def::calc_field_size(uint col, uchar *master_data) const
> {
> uint32 length;
>
> switch (type(col)) {
> case MYSQL_TYPE_NEWDECIMAL:
> length= my_decimal_get_binary_size(m_field_metadata[col] >> 8,
> m_field_metadata[col] - ((m_field_metadata[col] >> 8) <<
> 8));
Why "m_field_metadata[col] - ((m_field_metadata[col] >> 8) << 8)"? I thinks
this
is the same as "m_field_metadata[col]&0xFF" and the latter is much easier to
understand.
> break;
> case MYSQL_TYPE_DECIMAL:
> case MYSQL_TYPE_FLOAT:
> case MYSQL_TYPE_DOUBLE:
> length= m_field_metadata[col];
> break;
> case MYSQL_TYPE_SET:
> case MYSQL_TYPE_ENUM:
> case MYSQL_TYPE_STRING:
> {
> if (((m_field_metadata[col] & 0xff00) == (MYSQL_TYPE_SET << 8)) ||
> ((m_field_metadata[col] & 0xff00) == (MYSQL_TYPE_ENUM << 8)))
Or, "enum_field_types type= m_field_metadata[col]>>8U; if (type==MYSQL_TYPE_SET
|| type==MYSQL_TYPE_ENUM)" - saves few bit manipulations.
> length= m_field_metadata[col] & 0x00ff;
> else
Please put this else branch as separate case MYSQL_TYPE_STRING. It has virtually
no common code with the preceding two cases, especially after your patch which
removes "& 0x00ff" below.
> {
> length= m_field_metadata[col] & 0x00ff;
> DBUG_ASSERT(length > 0);
I would change this assert to "DBUG_ASSERT(length != 0)". Since length is
unsigned above formulation is confusing.
> if (length > 255)
> {
> DBUG_ASSERT(uint2korr(master_data) > 0);
> length= uint2korr(master_data) + 2;
To avoid double call to uint2korr:
length= uint2korr(master_data) + 2;
DBUG_ASSERT(length > 2);
(actually, one call is inside assert, but anyway...)
> }
> else
> length= (uint) *master_data + 1;
> }
Could you add a comment explaining why length is read twice here. As we
discussed, the value from m_field_metadata[col] could be the maximal length (as
declared in table definition), while the value read from master_data, the actual
length of the field.
> break;
> }
> case MYSQL_TYPE_YEAR:
> case MYSQL_TYPE_TINY:
> length= 1;
> break;
> case MYSQL_TYPE_SHORT:
> length= 2;
> break;
> case MYSQL_TYPE_INT24:
> length= 3;
> break;
> case MYSQL_TYPE_LONG:
> length= 4;
> break;
> #ifdef HAVE_LONG_LONG
> case MYSQL_TYPE_LONGLONG:
> length= 8;
> break;
> #endif
> case MYSQL_TYPE_NULL:
> length= 0;
> break;
> case MYSQL_TYPE_NEWDATE:
> length= 3;
> break;
> case MYSQL_TYPE_DATE:
> case MYSQL_TYPE_TIME:
> length= 3;
> break;
> case MYSQL_TYPE_TIMESTAMP:
> length= 4;
> break;
> case MYSQL_TYPE_DATETIME:
> length= 8;
> break;
> case MYSQL_TYPE_BIT:
> {
> uint from_len= (m_field_metadata[col] >> 8U) & 0x00ff;
> uint from_bit_len= m_field_metadata[col] & 0x00ff;
> DBUG_ASSERT(from_bit_len <= 7);
> length= from_len + ((from_bit_len > 0) ? 1 : 0);
Could you add some explanatory comment (I think I understand what is going on
here but would be good to explain it in a comment). For example document
from_len and from_bit_len variables.
> break;
> }
> case MYSQL_TYPE_VARCHAR:
> {
> length= m_field_metadata[col] > 255 ? 2 : 1; // c&p of
> Field_varstring::data_length()
> DBUG_ASSERT(uint2korr(master_data) > 0);
> length+= length == 1 ? (uint32) *master_data : uint2korr(master_data);
> break;
Looks the same as MYSQL_TYPE_STRING - why not handle them together.
> }
> case MYSQL_TYPE_TINY_BLOB:
> case MYSQL_TYPE_MEDIUM_BLOB:
> case MYSQL_TYPE_LONG_BLOB:
> case MYSQL_TYPE_BLOB:
> case MYSQL_TYPE_GEOMETRY:
> {
> #if 1
> /*
> BUG#29549:
> This is currently broken for NDB, which is using big-endian
> order when packing length of BLOB. Once they have decided how to
> fix the issue, we can enable the code below to make sure to
> always read the length in little-endian order.
> */
> Field_blob fb(m_field_metadata[col]);
> length= fb.get_packed_size(master_data, TRUE);
> #else
> /*
> Compute the length of the data. We cannot use get_length() here
> since it is dependent on the specific table (and also checks the
> packlength using the internal 'table' pointer) and replication
> is using a fixed format for storing data in the binlog.
> */
> switch (m_field_metadata[col]) {
> case 1:
> length= *master_data;
> break;
> case 2:
> length= sint2korr(master_data);
> break;
> case 3:
> length= uint3korr(master_data);
> break;
> case 4:
> length= uint4korr(master_data);
> break;
> default:
> DBUG_ASSERT(0); // Should not come here
> break;
Strange that sint*korr is sometimes used and sometimes uint*korr. Shouldn't it
be always uint*korr?
I also wonder why you sometimes use int*store/korr functions and sometimes code
numbers "by hand". Would be more consistent to use one of these methods. Using
int*store/korr functions has a drawback that if someone changes their
implementation the code will break (not able to correctly read old binlogs).
> }
>
> length+= m_field_metadata[col];
> #endif
> break;
> }
> default:
> length= -1;
> }
> return length;
> }
Best,
Rafal
cbell@stripped wrote:
> Below is the list of changes that have just been committed into a local
> 5.1 repository of cbell. When cbell does a push these changes will
> be propagated to the main repository and, within 24 hours after the
> push, to the public repository.
> For information on how to access the public repository
> see http://dev.mysql.com/doc/mysql/en/installing-source-tree.html
>
> ChangeSet@stripped, 2007-09-07 13:31:38-04:00, cbell@mysql_cab_desk. +2 -0
> BUG#30790 : Suspicious code in rpl_utility.cc
>
> This patch clarifies some of the coding choices with documentationa and
> removes a limitation in the code for future expansion of the CHAR and
> BINARY fields to length > 255.
>
> sql/log_event.cc@stripped, 2007-09-07 13:31:26-04:00, cbell@mysql_cab_desk. +10 -0
> BUG#30790 : Suspicious code in rpl_utility.cc
>
> This patch adds comments to help explain the choice of variable types.
>
> sql/rpl_utility.cc@stripped, 2007-09-07 13:31:26-04:00, cbell@mysql_cab_desk. +5 -1
> BUG#30790 : Suspicious code in rpl_utility.cc
>
> This patch removes the mask to allow length to be > 255. This is a
> future expansion measure since CHAR fields are limited to 255 characters.
> The code mirrors that of the packing methods in field.cc that are written
> to permit > 255 characters for CHAR -type fields should that change be
> made in the future.
>
> diff -Nrup a/sql/log_event.cc b/sql/log_event.cc
> --- a/sql/log_event.cc 2007-08-27 14:27:42 -04:00
> +++ b/sql/log_event.cc 2007-09-07 13:31:26 -04:00
> @@ -6469,6 +6469,16 @@ void Rows_log_event::print_helper(FILE *
> data) in the table map are initialized as zero (0). The array size is the
> same as the columns for the table on the slave.
>
> + Additionally, values saved for field metadata on the master are saved as a
> + string of bytes (uchar) in the binlog. A field may require 1 or more bytes
> + to store the information. In cases where values require multiple bytes
> + (e.g. values > 255), the endian-safe methods are used to properly encode
> + the values on the master and decode them on the slave. When the field
> + metadata values are captured on the slave, they are stored in an array of
> + type uint16. This allows the least number of casts to prevent casting bugs
> + when the field metadata is used in comparisons of field attributes. When
> + the field metadata is used for calculating addresses in pointer math, the
> + type used is uint32.
> */
>
> /**
> diff -Nrup a/sql/rpl_utility.cc b/sql/rpl_utility.cc
> --- a/sql/rpl_utility.cc 2007-08-27 07:43:58 -04:00
> +++ b/sql/rpl_utility.cc 2007-09-07 13:31:26 -04:00
> @@ -47,7 +47,11 @@ uint32 table_def::calc_field_size(uint c
> length= m_field_metadata[col] & 0x00ff;
> else
> {
> - length= m_field_metadata[col] & 0x00ff;
> + /*
> + Currently, MYSQL_TYPE_STRING is limited to 255 characters. The
> + following code is implemented to allow the expansion of that
> + restriction.
> + */
> DBUG_ASSERT(length > 0);
> if (length > 255)
> {
>
>