List:Commits« Previous MessageNext Message »
From:Mats Kindahl Date:December 10 2009 10:06am
Subject:Re: bzr commit into mysql-5.1 branch (mats:3180) WL#5151
View as plain text  

Alfranio Correia wrote:
> Hi Mats,
> 
> Excellent and impressive work.
> There are just a few details that need to be fixed before
> it is ready to be pushed.
> 
> STATUS
> ------
>   Not approved.
> 
> REQUIRED CHANGES
> ----------------
> 1 - Is there any particular reason to use the macro below?
> I think the pattern is to write the function name.
> 
> DBUG_ENTER(__FUNCTION__);

Ah, right. Yes, I use it quite frequently for debugging, but I missed removing
all of these in the final patch.

> 2 - Address the comments in the test case.
> 
> 3 - Address the comments in the code.
> 
> 
> REQUESTS
> --------
>   n/a
> 
> SUGGESTIONS
> -----------
> 
> 1 - In can_convert_field_to:
> 
> Would it be better (e.g. ieasier to maitain and read) to have a compatibility matrix
> where the safe combinations would be decided straightforward and those combinations
> requiring further processing will be checked by calling compatible_field_size() and
> any
> other auxiliar function?

The function compatible_field_size() can only be called when the real types are
the same and should not be called in any other case.

A matrix could be used to ensure the validity of the conversions. A
straightforward way of constructing such a one would be to have a matrix of
function pointers, where the presence of a pointer means that the function
should be called with the two types to decide validity. Such a matrix would be 4
* 256 * 256 = 256 KiB  and be very hard to maintain. Just keeping a single byte
would improve the size, but not make it easier to maintain. A switch is
(usually, but not always) slightly slower, but it is on the other hand easier to
maintain.

Hence I chose to use a nested switch.

The function is 2107 bytes large when compiled.

> DETAILS
> -------
> 
> 
> I am not the arch. reviewer but I have some doubts on the worklog.
> 
> 1 - If conversion is necessary, a fake table is created and then saved to be reused.
> I haven't seen any comment on how the fake table is invalidated.

The fake table live for the duration of the statement, and is the memory is
released when the statement ends.

> 2 - The definition of non-lossy and lossy for strings are hard to read:
> 
> - "Non-lossy conversions from STRING(N) to STRING'(M), where M > N or M = N and
> STRING != STRING', is supported if and only if ``ALL_NON_LOSSY`` is in
> ``SLAVE_TYPE_CONVERSIONS``."
> 
> - "Note that if M = N and STRING = STRING' it is not considered a non-lossy
> conversion, which means that if non-lossy conversion is not enabled, only
> replication between identical types is permitted."

Correction here: remove everything after the comma.

Fixed the worklog.

> - Lossy conversion from STRING(M) to STRING'(N), where M > N, is supported if and
> only if ``ALL_LOSSY`` is in ``SLAVE_TYPE_CONVERSIONS``.
> 
> - Note that the case when M = N and STRING = STRING' is not considered a lossy
> conversion.
> 
> Are those definitions contradictory, aren't they?

If you keep the stuff after the comma above, yes, otherwise, no.

If M = N and STRING = STRING' the conversion is neither lossy nor non-lossy (as
a matter of fact, it is not a conversion).

[snip]

>> --- a/mysql-test/extra/rpl_tests/check_type.inc	1970-01-01 00:00:00 +0000
>> +++ b/mysql-test/extra/rpl_tests/check_type.inc	2009-12-04 10:53:15 +0000
>> @@ -0,0 +1,50 @@
>> +# Helper file to perform one insert of a value into a table with
>> +# different types on master and slave.  The file will insert the
>> +# result into the type_conversions table *on the slave* to get a
>> +# summary of failing and succeeding tests.
>> +
>> +# Input:
>> +#    $source_type      Type on the master
>> +#    $target_type      Type on the slave
> 
>> +#    $source_value     Type on the master (inserted into the table)
> 
> s/Type/Value/
> 
>> +#    $target_value     Type on the slave (expected value in the table
>> +#                      on the slave)
> 
> s/Type/Value/

Fixed.

[snip]

>> === added file 'mysql-test/extra/rpl_tests/type_conversions.test'
>> --- a/mysql-test/extra/rpl_tests/type_conversions.test	1970-01-01 00:00:00 +0000
>> +++ b/mysql-test/extra/rpl_tests/type_conversions.test	2009-12-04 10:53:15 +0000

[snip]

>> +let $source_type= MEDIUMINT;
>> +let $target_type= TINYINT;
>> +let $source_value= 1;
>> +let $target_value= 1;
>> +let $can_convert  = $if_is_lossy;
>> +source extra/rpl_tests/check_type.inc;
>> +
>> +let $source_type= MEDIUMINT;
>> +let $target_type= SMALLINT;
>> +let $source_value= 1;
>> +let $target_value= 1;
>> +let $can_convert  = $if_is_lossy;
>> +source extra/rpl_tests/check_type.inc;
> 
> 
> I think we need to check also the highest values.
> Basicaly, any conversation that requeries the lossy option should
> check the corner cases.

OK, but I think it is sufficient to check for a value higher than the max for
the smaller type

[snip]

>> +let $source_type= VARCHAR(500);
>> +let $target_type= VARCHAR(255);
>> +let $source_value= '$blob';
>> +let $target_value= '$truncated_blob';
>> +let $can_convert = $if_is_lossy;
>> +source extra/rpl_tests/check_type.inc;
> 
> 
> I think we are missing some combinations:
> 
> TINYTEXT/TEXT,....

We are missing a lot of combinations. I have just added the ones that I see as
potential problems.

I'll add TINYTEXT <=> TEXT

[snip]

>> +let $source_type= BIT(12);
>> +let $target_type= BIT(5);
>> +let $source_value= b'101100111000';
>> +let $target_value= b'11111';
>> +let $can_convert = $if_is_lossy;
>> +source extra/rpl_tests/check_type.inc;
> 
> 
> I did not get the result for the bit type.

What did you get? This test passes on my machine, as you can see in the result file.

[snip]

>> +BIT(5)         	BIT(5)         	                         	<Correct value>
>> +BIT(5)         	BIT(6)         	                         	<Correct error>
>> +BIT(6)         	BIT(5)         	                         	<Correct error>
>> +BIT(5)         	BIT(12)        	                         	<Correct error>
>> +BIT(12)        	BIT(5)         	                         	<Correct error>

[snip]

>> +BIT(5)         	BIT(5)         	ALL_NON_LOSSY            	<Correct value>
>> +BIT(5)         	BIT(6)         	ALL_NON_LOSSY            	<Correct value>
>> +BIT(6)         	BIT(5)         	ALL_NON_LOSSY            	<Correct error>
>> +BIT(5)         	BIT(12)        	ALL_NON_LOSSY            	<Correct value>
>> +BIT(12)        	BIT(5)         	ALL_NON_LOSSY            	<Correct error>

[snip]

>> +BIT(5)         	BIT(5)         	ALL_LOSSY                	<Correct value>
>> +BIT(5)         	BIT(6)         	ALL_LOSSY                	<Correct error>
>> +BIT(6)         	BIT(5)         	ALL_LOSSY                	<Correct value>
>> +BIT(5)         	BIT(12)        	ALL_LOSSY                	<Correct error>
>> +BIT(12)        	BIT(5)         	ALL_LOSSY                	<Correct value>
[snip]

>> +BIT(5)         	BIT(5)         	ALL_LOSSY,ALL_NON_LOSSY  	<Correct value>
>> +BIT(5)         	BIT(6)         	ALL_LOSSY,ALL_NON_LOSSY  	<Correct value>
>> +BIT(6)         	BIT(5)         	ALL_LOSSY,ALL_NON_LOSSY  	<Correct value>
>> +BIT(5)         	BIT(12)        	ALL_LOSSY,ALL_NON_LOSSY  	<Correct value>
>> +BIT(12)        	BIT(5)         	ALL_LOSSY,ALL_NON_LOSSY  	<Correct value>

[snip]

>> +bool Field::compatible_field_size(uint field_metadata,
>> +                                  Relay_log_info *rli_arg
> __attribute__((unused)),
>> +                                  int *order_var)
>>  {
>>    uint const source_size= pack_length_from_metadata(field_metadata);
>>    uint const destination_size= row_pack_length();
>> -  return (source_size <= destination_size);
>> +  DBUG_PRINT("debug", ("real_type: %d, source_size: %u, destination_size: %u",
>> +                       real_type(), source_size, destination_size));
>> +  *order_var = compare(source_size, destination_size);
>> +  return true;
> 
> 
> --->

???

[snip]

>> @@ -7625,6 +7650,7 @@ Field_blob::Field_blob(uchar *ptr_arg, u
>>                   cs),
>>     packlength(blob_pack_length)
>>  {
>> +  DBUG_ASSERT(blob_pack_length <= 4); // Only pack lengths 1-4 supported
> currently
> 
> 
> Can you explain this?

All blobs have a pack length which is the number of bytes to store the length.
TINYTEXT and TINYBLOB for example, have a packlength of 1, meaning that they can
be max 255 bytes long.

Since we have no blobs with a packlength exceeding 4, and should they be added,
the server will be unable to handle it anyway, I added the assertion. I caught a
number of problems this way.

[snip]

>> +  uint row_pack_length() { return pack_length(); }
>> +  uint32 pack_length_from_metadata(uint field_metadata) {
>> +    uint32 length= pack_length();
>> +    DBUG_PRINT("result", ("pack_length_from_metadata(%d): %u",
>> +                          field_metadata, length));
>> +    return length;
>> +  }
> 
> Why do you need a different method? Is this just for debugging?

Field::pack_length_from_metadata() returns field_metadata by default, which does
not work. It need to return the pack length of the field.

[snip]

>> === modified file 'sql/mysqld.cc'
>> --- a/sql/mysqld.cc	2009-09-29 15:38:40 +0000
>> +++ b/sql/mysqld.cc	2009-12-04 10:53:15 +0000

[snip]

>> +  {"slave-type-conversions", OPT_SLAVE_TYPE_CONVERSIONS,
>> +   "Slave type conversions that are enabled. Legal values are"
>> +   " ALL_LOSSY to enable lossy conversions and"
>> +   " ALL_NON_LOSSY to enable non-lossy conversions.",
>> +   (uchar**) &slave_type_conversions_default,
>> +   (uchar**) &slave_type_conversions_default,
>> +   0, GET_STR, REQUIRED_ARG, 0, 0, 0, 0, 0, 0},
> 
> 
> I think you should add the case that the option is empty in order
> to have a description on how to force the slave to be strictly equal
> to the master.

Well... it is a set, so any combination of the flags are allowed, including the
empty set.

I'll clarify the documentation.

[snip]

>> === modified file 'sql/rpl_utility.cc'
>> --- a/sql/rpl_utility.cc	2008-06-30 20:11:18 +0000
>> +++ b/sql/rpl_utility.cc	2009-12-04 10:53:15 +0000
>> @@ -14,8 +14,178 @@
>>     Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
> */
>>  
>>  #include "rpl_utility.h"
>> +
>> +#ifndef MYSQL_CLIENT
>>  #include "rpl_rli.h"
>>  
>> +/**
>> +   Template function to compare two objects.
>> + */
>> +namespace {
>> +  template <class Type>
>> +  int compare(Type a, Type b)
>> +  {
>> +    if (a < b)
>> +      return -1;
>> +    if (b < a)
>> +      return 1;
>> +    return 0;
>> +  }
>> +}
> 
> Why don't you move the other compare template to this utility file?

Because there is no dependency between field.cc and rpl_utility.cc, so adding
one only to avoid repeating this simple template function is not a good idea.

>> +
>> +
>> +/**
>> +   Max value for an unsigned integer of 'bits' bits.
>> +
>> +   The somewhat contorted expression is to avoid overflow.
>> + */
>> +uint32 uint_max(int bits) {
>> +  return ((1UL << (bits - 1)) - 1) << 1 | 1;
>> +}
> 
> 
> Please, do what follows:
> 
> (((1UL << (bits - 1)) - 1) << 1) | 1;

Parantheses added.

[snip]

>> +  /*
>> +    If all conversions are disabled, it is not allowed to convert
>> +    between these types. Since the TEXT vs. BINARY is distinguished by
>> +    the charset, and the charset is not replication, we cannot
>> +    currently distinguish between , e.g., TEXT and BLOB.
> 
> s/is not replication/is not replicated/
> 

Fixed.

[snip]

>> +  /*
>> +    The types MYSQL_TYPE_DECIMAL and MYSQL_TYPE_VAR_STRING should not
>> +    appear inside the server at all.
>> +  */
> 
> 
> Can you elaborate on this?

MYSQL_TYPE_DECIMAL and MYSQL_TYPE_VAR_STRING are old types and cannot be
constructed in a new server, only by reading definitions from old frm files.
They are present in the static code, so they should probably be handled anyway.

Added code to handle them as well as can be done.

[snip]

>> +TABLE *table_def::create_conversion_table(THD *thd, Relay_log_info *rli, TABLE
> *target_table) const

[snip]

>> +    DBUG_PRINT("debug", ("sql_type: %d, max_length: %d, decimals: %d,"
>> +                         " maybe_null: %d, unsigned_flag: %d, pack_length: %u",
>> +                         type(col), max_length, decimals, TRUE, FALSE,
> pack_length));
>> +    field_def->init_for_tmp_table(type(col),
>> +                                  max_length,
>> +                                  decimals,
>> +                                  TRUE,         // maybe_null
>> +                                  FALSE,        // unsigned_flag
>> +                                  pack_length);
> 
> 
> Shouldn't you check if the field is unsigned? Null?

It is not possible to get information about whether the field is unsigned (there
is no information in the table map event about that) and using NULL is not
strictly necessary for the temporary table since no type comparison is made
between it and the target table.

I added the null flag anyway, since it is available.

[snip]

>> +table_def::table_def(unsigned char *types, ulong size,
>> +                     uchar *field_metadata, int metadata_size,
>> +                     uchar *null_bitmap)
> 
>> +  : m_size(size), m_type(0), m_field_metadata_size(metadata_size),
>> +    m_field_metadata(0), m_null_bits(0), m_memory(NULL)
> 
> Why do you initialize some fields in here if you do it explicetly below?

I just moved the constructor from the header file.

It is also useful to catch bugs, and the optimizer will eliminate the redundant
assignment anyway.

[snip]

>> +      case MYSQL_TYPE_BIT:
>> +      {
> 
> 
>> +        uint16 x= field_metadata[index++]; 
>> +        x = x + (field_metadata[index++] << 8U);
> 
> 
> Is this a mistake? Note that the most significative byte is in the second part of the
> array.
> In the other cases, it is always in the first part.

Eh, not sure I follow you.

This code is copied from the header file.

>> +        m_field_metadata[i]= x;
>> +        break;
>> +      }
>> +      case MYSQL_TYPE_VARCHAR:
>> +      {
> 
> 
>> +        /*
>> +          These types store two bytes.
>> +        */
>> +        char *ptr= (char *)&field_metadata[index];
>> +        m_field_metadata[i]= uint2korr(ptr);
>> +        index= index + 2;
> 
> 
> I was wondering if this may not generate any architecture issue.

Please elaborate.

This code was just copied from the header file.

[snip]

>> +
>> +table_def::~table_def()
>> +{
>> +  my_free(m_memory, MYF(0));
> 
>> +#ifndef DBUG_OFF
>> +  m_type= 0;
>> +  m_size= 0;
>> +#endif
> 
> What are you trying to prevent with this?

Discovering memory released twice in debug builds.

This code is copied from the header file.

[snip]

>> @@ -1261,7 +1294,6 @@ bool sys_var_thd_binlog_format::check(TH
>>    return result;
>>  }
>>  
>> -
>>  bool sys_var_thd_binlog_format::is_readonly() const
>>  {
>>    /*
>>
> 
> 
> Please, remove this hunk.

OK.

-- 
Mats Kindahl
Senior Software Engineer
Database Technology Group
Sun Microsystems
Thread
bzr commit into mysql-5.1 branch (mats:3180) WL#5151Mats Kindahl4 Dec
  • Re: bzr commit into mysql-5.1 branch (mats:3180) WL#5151Alfranio Correia10 Dec
    • Re: bzr commit into mysql-5.1 branch (mats:3180) WL#5151Sergei Golubchik10 Dec
    • Re: bzr commit into mysql-5.1 branch (mats:3180) WL#5151Mats Kindahl10 Dec
      • Re: bzr commit into mysql-5.1 branch (mats:3180) WL#5151Alfranio Correia10 Dec
        • Re: bzr commit into mysql-5.1 branch (mats:3180) WL#5151Mats Kindahl10 Dec
  • Re: bzr commit into mysql-5.1 branch (mats:3180) WL#5151Luís Soares10 Dec
    • Re: bzr commit into mysql-5.1 branch (mats:3180) WL#5151Mats Kindahl10 Dec