From:Alfranio Correia Date:December 10 2009 11:50am
Subject:Re: bzr commit into mysql-5.1 branch (mats:3180) WL#5151
Hi Mats,

Thank you for the prompt reply.
See some comments in-line

Cheers.

Mats Kindahl wrote:
> 
> Alfranio Correia wrote:
>> Hi Mats,
>>
>> Excellent and impressive work.
>> There are just a few details that need to be fixed before
>> it is ready to be pushed.
>>
>> STATUS
>> ------
>>   Not approved.
>>
>> REQUIRED CHANGES
>> ----------------
>> 1 - Is there any particular reason to use the macro below?
>> I think the pattern is to write the function name.
>>
>> DBUG_ENTER(__FUNCTION__);
> 
> Ah, right. Yes, I use it quite frequently for debugging, but I missed removing
> all of these in the final patch.
> 
>> 2 - Address the comments in the test case.
>>
>> 3 - Address the comments in the code.
>>
>>
>> REQUESTS
>> --------
>>   n/a
>>
>> SUGGESTIONS
>> -----------
>>
>> 1 - In can_convert_field_to:
>>
>> Would it be better (e.g. easier to maintain and read) to have a compatibility
>> matrix where the safe combinations would be decided straightforwardly and
>> those combinations requiring further processing would be checked by calling
>> compatible_field_size() and any other auxiliary function?
> 
> The function compatible_field_size() can only be called when the real types are
> the same and should not be called in any other case.
> 
> A matrix could be used to ensure the validity of the conversions. A
> straightforward way of constructing such a matrix would be to have a matrix of
> function pointers, where the presence of a pointer means that the function
> should be called with the two types to decide validity. Such a matrix would be
> 4 * 256 * 256 = 256 KiB and be very hard to maintain. Just keeping a single
> byte would improve the size, but not make it easier to maintain. A switch is
> (usually, but not always) slightly slower, but it is on the other hand easier
> to maintain.


ok.

> 
> Hence I chose to use a nested switch.
> 
> The function is 2107 bytes large when compiled.
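For illustration, the nested-switch shape described above might look like the sketch below. The type names and conversion rules here are hypothetical stand-ins, not the server's actual enum_field_types table or its real rules.

```cpp
#include <cassert>

// Hypothetical subset of type codes, for illustration only.
enum field_type { TYPE_TINY, TYPE_SHORT, TYPE_LONG, TYPE_STRING };

// Sketch of the nested switch: the outer switch dispatches on the
// source type, the inner one on the target type.  The real function
// would also call compatible_field_size() when the real types match
// and consult the SLAVE_TYPE_CONVERSIONS flags.
bool can_convert(field_type source, field_type target)
{
  switch (source) {
  case TYPE_TINY:
    switch (target) {
    case TYPE_TINY: case TYPE_SHORT: case TYPE_LONG:
      return true;                  // same or wider integer: OK
    default:
      return false;                 // integer to string: rejected here
    }
  case TYPE_SHORT:
    switch (target) {
    case TYPE_TINY: case TYPE_SHORT: case TYPE_LONG:
      return true;                  // narrowing would need ALL_LOSSY
    default:
      return false;
    }
  default:
    return source == target;        // identical types always replicate
  }
}
```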
> 
>> DETAILS
>> -------
>>
>>
>> I am not the arch. reviewer but I have some doubts on the worklog.
>>
>> 1 - If conversion is necessary, a fake table is created and then saved to be
> reused.
>> I haven't seen any comment on how the fake table is invalidated.
> 
> The fake table lives for the duration of the statement, and the memory is
> released when the statement ends.
> 
>> 2 - The definition of non-lossy and lossy for strings are hard to read:
>>
>> - "Non-lossy conversions from STRING(N) to STRING'(M), where M > N, or M = N
>> and STRING != STRING', is supported if and only if ``ALL_NON_LOSSY`` is in
>> ``SLAVE_TYPE_CONVERSIONS``."
>>
>> - "Note that if M = N and STRING = STRING' it is not considered a non-lossy
>> conversion, which means that if non-lossy conversion is not enabled, only
>> replication between identical types is permitted."
> 
> Correction here: remove everything after the comma.
> 
> Fixed the worklog.
> 
>> - Lossy conversion from STRING(M) to STRING'(N), where M > N, is supported if
>> and only if ``ALL_LOSSY`` is in ``SLAVE_TYPE_CONVERSIONS``.
>>
>> - Note that the case when M = N and STRING = STRING' is not considered a lossy
>> conversion.
>>
>> Those definitions are contradictory, aren't they?
> 
> If you keep the stuff after the comma above, yes, otherwise, no.
> 
> If M = N and STRING = STRING' the conversion is neither lossy nor non-lossy (as
> a matter of fact, it is not a conversion).
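The distinction can be summarized as a small classifier. This is only a sketch of the rules as quoted above, with names of my own choosing, not code from the patch.

```cpp
#include <cassert>

enum conv_class { CONV_IDENTITY, CONV_NON_LOSSY, CONV_LOSSY };

// Classify STRING(N) -> STRING'(M) per the rules quoted above:
// same string type with M = N is no conversion at all; M > N, or
// M = N with a different string type, is non-lossy; M < N is lossy.
conv_class classify(int n, int m, bool same_string_type)
{
  if (same_string_type && m == n)
    return CONV_IDENTITY;   // neither lossy nor non-lossy
  if (m >= n)
    return CONV_NON_LOSSY;  // requires ALL_NON_LOSSY
  return CONV_LOSSY;        // requires ALL_LOSSY
}
```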
> 
> [snip]
> 
>>> --- a/mysql-test/extra/rpl_tests/check_type.inc	1970-01-01 00:00:00 +0000
>>> +++ b/mysql-test/extra/rpl_tests/check_type.inc	2009-12-04 10:53:15 +0000
>>> @@ -0,0 +1,50 @@
>>> +# Helper file to perform one insert of a value into a table with
>>> +# different types on master and slave.  The file will insert the
>>> +# result into the type_conversions table *on the slave* to get a
>>> +# summary of failing and succeeding tests.
>>> +
>>> +# Input:
>>> +#    $source_type      Type on the master
>>> +#    $target_type      Type on the slave
>>> +#    $source_value     Type on the master (inserted into the table)
>> s/Type/Value/
>>
>>> +#    $target_value     Type on the slave (expected value in the table
>>> +#                      on the slave)
>> s/Type/Value/
> 
> Fixed.
> 
> [snip]
> 
>>> === added file 'mysql-test/extra/rpl_tests/type_conversions.test'
>>> --- a/mysql-test/extra/rpl_tests/type_conversions.test	1970-01-01 00:00:00 +0000
>>> +++ b/mysql-test/extra/rpl_tests/type_conversions.test	2009-12-04 10:53:15 +0000
> 
> [snip]
> 
>>> +let $source_type= MEDIUMINT;
>>> +let $target_type= TINYINT;
>>> +let $source_value= 1;
>>> +let $target_value= 1;
>>> +let $can_convert  = $if_is_lossy;
>>> +source extra/rpl_tests/check_type.inc;
>>> +
>>> +let $source_type= MEDIUMINT;
>>> +let $target_type= SMALLINT;
>>> +let $source_value= 1;
>>> +let $target_value= 1;
>>> +let $can_convert  = $if_is_lossy;
>>> +source extra/rpl_tests/check_type.inc;
>>
>> I think we also need to check the highest values.
>> Basically, any conversion that requires the lossy option should
>> check the corner cases.
> 
> OK, but I think it is sufficient to check for a value higher than the max for
> the smaller type.

That's fine.
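Picking such a boundary value is straightforward: MySQL's small integer types are 1, 2, and 3 bytes wide, so the narrower type's signed maximum plus one is the interesting test input. A hedged helper (my own name, assuming signed column types):

```cpp
#include <cassert>
#include <cstdint>

// Signed maximum for an integer stored in the given number of bytes:
// TINYINT is 1 byte (max 127), SMALLINT 2 bytes (max 32767),
// MEDIUMINT 3 bytes (max 8388607).  A value of signed_max(bytes) + 1
// in the source column forces the lossy path on conversion.
int64_t signed_max(int bytes)
{
  return (INT64_C(1) << (8 * bytes - 1)) - 1;
}
```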

> 
> [snip]
> 
>>> +let $source_type= VARCHAR(500);
>>> +let $target_type= VARCHAR(255);
>>> +let $source_value= '$blob';
>>> +let $target_value= '$truncated_blob';
>>> +let $can_convert = $if_is_lossy;
>>> +source extra/rpl_tests/check_type.inc;
>>
>> I think we are missing some combinations:
>>
>> TINYTEXT/TEXT,....
> 
> We are missing a lot of combinations. I have just added the ones that I see as
> potential problems.

ok. Thanks.

> 
> I'll add TINYTEXT <=> TEXT
> 
> [snip]
> 
>>> +let $source_type= BIT(12);
>>> +let $target_type= BIT(5);
>>> +let $source_value= b'101100111000';
>>> +let $target_value= b'11111';
>>> +let $can_convert = $if_is_lossy;
>>> +source extra/rpl_tests/check_type.inc;
>>
>> I did not get the result for the bit type.
> 
> What did you get? This test passes on my machine, as you can see in the result file.
> 
> [snip]
> 
>>> +BIT(5)         	BIT(5)         	                         	<Correct value>
>>> +BIT(5)         	BIT(6)         	                         	<Correct error>
>>> +BIT(6)         	BIT(5)         	                         	<Correct error>
>>> +BIT(5)         	BIT(12)        	                         	<Correct error>
>>> +BIT(12)        	BIT(5)         	                         	<Correct error>
> 
> [snip]
> 
>>> +BIT(5)         	BIT(5)         	ALL_NON_LOSSY            	<Correct value>
>>> +BIT(5)         	BIT(6)         	ALL_NON_LOSSY            	<Correct value>
>>> +BIT(6)         	BIT(5)         	ALL_NON_LOSSY            	<Correct error>
>>> +BIT(5)         	BIT(12)        	ALL_NON_LOSSY            	<Correct value>
>>> +BIT(12)        	BIT(5)         	ALL_NON_LOSSY            	<Correct error>
> 
> [snip]
> 
>>> +BIT(5)         	BIT(5)         	ALL_LOSSY                	<Correct value>
>>> +BIT(5)         	BIT(6)         	ALL_LOSSY                	<Correct error>
>>> +BIT(6)         	BIT(5)         	ALL_LOSSY                	<Correct value>
>>> +BIT(5)         	BIT(12)        	ALL_LOSSY                	<Correct error>
>>> +BIT(12)        	BIT(5)         	ALL_LOSSY                	<Correct value>
> [snip]
> 
>>> +BIT(5)         	BIT(5)         	ALL_LOSSY,ALL_NON_LOSSY  	<Correct value>
>>> +BIT(5)         	BIT(6)         	ALL_LOSSY,ALL_NON_LOSSY  	<Correct value>
>>> +BIT(6)         	BIT(5)         	ALL_LOSSY,ALL_NON_LOSSY  	<Correct value>
>>> +BIT(5)         	BIT(12)        	ALL_LOSSY,ALL_NON_LOSSY  	<Correct value>
>>> +BIT(12)        	BIT(5)         	ALL_LOSSY,ALL_NON_LOSSY  	<Correct value>
> 
> [snip]
> 
>>> +bool Field::compatible_field_size(uint field_metadata,
>>> +                                  Relay_log_info *rli_arg __attribute__((unused)),
>>> +                                  int *order_var)
>>>  {
>>>    uint const source_size= pack_length_from_metadata(field_metadata);
>>>    uint const destination_size= row_pack_length();
>>> -  return (source_size <= destination_size);
>>> +  DBUG_PRINT("debug", ("real_type: %d, source_size: %u, destination_size: %u",
>>> +                       real_type(), source_size, destination_size));
>>> +  *order_var = compare(source_size, destination_size);
>>> +  return true;


Sorry. Forget about it.

>>
>> --->
> 
> ???
> 
> [snip]
> 
>>> @@ -7625,6 +7650,7 @@ Field_blob::Field_blob(uchar *ptr_arg, u
>>>                   cs),
>>>     packlength(blob_pack_length)
>>>  {
>>> +  DBUG_ASSERT(blob_pack_length <= 4); // Only pack lengths 1-4 supported currently
>>
>> Can you explain this?
> 
> All blobs have a pack length which is the number of bytes to store the length.
> TINYTEXT and TINYBLOB for example, have a packlength of 1, meaning that they can
> be max 255 bytes long.
> 
> Since we have no blobs with a packlength exceeding 4, and should they be added,
> the server will be unable to handle it anyway, I added the assertion. I caught a
> number of problems this way.
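In other words, the pack length bounds the blob size. A sketch of the relation (the helper name is mine, not the server's):

```cpp
#include <cassert>
#include <cstdint>

// Largest payload a blob with the given pack length can describe:
// the pack length is the number of bytes used to store the length,
// so packlength 1 (TINYBLOB/TINYTEXT) caps the value at 255 bytes
// and packlength 4 (LONGBLOB) at 2^32 - 1 bytes.
uint64_t blob_max_length(unsigned packlength)
{
  return (UINT64_C(1) << (8 * packlength)) - 1;
}
```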
> 
> [snip]
> 
>>> +  uint row_pack_length() { return pack_length(); }
>>> +  uint32 pack_length_from_metadata(uint field_metadata) {
>>> +    uint32 length= pack_length();
>>> +    DBUG_PRINT("result", ("pack_length_from_metadata(%d): %u",
>>> +                          field_metadata, length));
>>> +    return length;
>>> +  }
>> Why do you need a different method? Is this just for debugging?
> 
> Field::pack_length_from_metadata() returns field_metadata by default, which does
> not work. It needs to return the pack length of the field.


Both row_pack_length() and pack_length_from_metadata() rely on pack_length() to
return the value. The difference is the parameter to pack_length_from_metadata(),
which is only used for debugging.

> 
> [snip]
> 
>>> === modified file 'sql/mysqld.cc'
>>> --- a/sql/mysqld.cc	2009-09-29 15:38:40 +0000
>>> +++ b/sql/mysqld.cc	2009-12-04 10:53:15 +0000
> 
> [snip]
> 
>>> +  {"slave-type-conversions", OPT_SLAVE_TYPE_CONVERSIONS,
>>> +   "Slave type conversions that are enabled. Legal values are"
>>> +   " ALL_LOSSY to enable lossy conversions and"
>>> +   " ALL_NON_LOSSY to enable non-lossy conversions.",
>>> +   (uchar**) &slave_type_conversions_default,
>>> +   (uchar**) &slave_type_conversions_default,
>>> +   0, GET_STR, REQUIRED_ARG, 0, 0, 0, 0, 0, 0},
>>
>> I think you should add the case that the option is empty in order
>> to have a description on how to force the slave to be strictly equal
>> to the master.
> 
> Well... it is a set, so any combination of the flags is allowed, including the
> empty set.
> 
> I'll clarify the documentation.

ok.
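As a sketch of that set semantics (the flag names and values here are illustrative, not the server's actual constants): the option behaves as a bitmask, and the empty set forces master and slave types to be identical.

```cpp
#include <cassert>

// Illustrative flag values; the real server defines its own constants.
enum {
  SLAVE_TC_ALL_LOSSY     = 1 << 0,
  SLAVE_TC_ALL_NON_LOSSY = 1 << 1
};

// With the empty set (0), neither kind of conversion is enabled, so
// only replication between identical types is permitted.
bool conversion_allowed(unsigned set, bool lossy)
{
  return set & (lossy ? SLAVE_TC_ALL_LOSSY : SLAVE_TC_ALL_NON_LOSSY);
}
```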

> 
> [snip]
> 
>>> === modified file 'sql/rpl_utility.cc'
>>> --- a/sql/rpl_utility.cc	2008-06-30 20:11:18 +0000
>>> +++ b/sql/rpl_utility.cc	2009-12-04 10:53:15 +0000
>>> @@ -14,8 +14,178 @@
>>>    Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301 USA */
>>>  
>>>  #include "rpl_utility.h"
>>> +
>>> +#ifndef MYSQL_CLIENT
>>>  #include "rpl_rli.h"
>>>  
>>> +/**
>>> +   Template function to compare two objects.
>>> + */
>>> +namespace {
>>> +  template <class Type>
>>> +  int compare(Type a, Type b)
>>> +  {
>>> +    if (a < b)
>>> +      return -1;
>>> +    if (b < a)
>>> +      return 1;
>>> +    return 0;
>>> +  }
>>> +}
>> Why don't you move the other compare template to this utility file?
> 
> Because there is no dependency between field.cc and rpl_utility.cc, so adding
> one only to avoid repeating this simple template function is not a good idea.

ok. I am fine.

> 
>>> +
>>> +
>>> +/**
>>> +   Max value for an unsigned integer of 'bits' bits.
>>> +
>>> +   The somewhat contorted expression is to avoid overflow.
>>> + */
>>> +uint32 uint_max(int bits) {
>>> +  return ((1UL << (bits - 1)) - 1) << 1 | 1;
>>> +}
>>
>> Please, do what follows:
>>
>> (((1UL << (bits - 1)) - 1) << 1) | 1;
> 
> Parentheses added.
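To see why the two-step shift matters: for bits == 32 the naive (1UL << bits) - 1 would shift a 32-bit long by its full width, which is undefined behaviour in C and C++, while the expression in the patch stays within range at every step. A quick check:

```cpp
#include <cassert>
#include <cstdint>

// The patch's expression with the reviewer's extra parentheses: shift
// in two steps so that bits == 32 never produces the undefined 1UL << 32.
uint32_t uint_max(int bits)
{
  return (((1UL << (bits - 1)) - 1) << 1) | 1;
}
```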
> 
> [snip]
> 
>>> +  /*
>>> +    If all conversions are disabled, it is not allowed to convert
>>> +    between these types. Since the TEXT vs. BINARY is distinguished by
>>> +    the charset, and the charset is not replication, we cannot
>>> +    currently distinguish between , e.g., TEXT and BLOB.
>> s/is not replication/is not replicated/
>>
> 
> Fixed.
> 
> [snip]
> 
>>> +  /*
>>> +    The types MYSQL_TYPE_DECIMAL and MYSQL_TYPE_VAR_STRING should not
>>> +    appear inside the server at all.
>>> +  */
>>
>> Can you elaborate on this?
> 
> MYSQL_TYPE_DECIMAL and MYSQL_TYPE_VAR_STRING are old types that cannot be
> created by a new server; they can only appear when reading definitions from old
> frm files. They are present in the static code, so they should probably be
> handled anyway.
> 
> Added code to handle them as well as can be done.
> 
> [snip]
> 
>>> +TABLE *table_def::create_conversion_table(THD *thd, Relay_log_info *rli, TABLE *target_table) const
> 
> [snip]
> 
>>> +    DBUG_PRINT("debug", ("sql_type: %d, max_length: %d, decimals: %d,"
>>> +                         " maybe_null: %d, unsigned_flag: %d, pack_length: %u",
>>> +                         type(col), max_length, decimals, TRUE, FALSE, pack_length));
>>> +    field_def->init_for_tmp_table(type(col),
>>> +                                  max_length,
>>> +                                  decimals,
>>> +                                  TRUE,         // maybe_null
>>> +                                  FALSE,        // unsigned_flag
>>> +                                  pack_length);
>>
>> Shouldn't you check if the field is unsigned? Null?
> 
> It is not possible to get information about whether the field is unsigned (there
> is no information in the table map event about that) and using NULL is not
> strictly necessary for the temporary table since no type comparison is made
> between it and the target table.
> 
> I added the null flag anyway, since it is available.

Ok.

> 
> [snip]
> 
>>> +table_def::table_def(unsigned char *types, ulong size,
>>> +                     uchar *field_metadata, int metadata_size,
>>> +                     uchar *null_bitmap)
>>> +  : m_size(size), m_type(0), m_field_metadata_size(metadata_size),
>>> +    m_field_metadata(0), m_null_bits(0), m_memory(NULL)
>> Why do you initialize some fields in here if you do it explicitly below?
> 
> I just moved the constructor from the header file.
> 
> It is also useful to catch bugs, and the optimizer will eliminate the redundant
> assignment anyway.


Ok.

> 
> [snip]
> 
>>> +      case MYSQL_TYPE_BIT:
>>> +      {
>>
>>> +        uint16 x= field_metadata[index++]; 
>>> +        x = x + (field_metadata[index++] << 8U);
>>
>> Is this a mistake? Note that the most significant byte is in the second part
>> of the array. In the other cases, it is always in the first part.
> 
> Eh, not sure I follow you.

When there are two bytes to represent metadata, the most significant bits come first:

uint16 x= field_metadata[index++] << 8U;
x = x + field_metadata[index++];
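The difference between the two readings is easy to demonstrate. In the sketch below, read_patch follows the code in the patch and read_review follows the suggestion above (both function names are mine):

```cpp
#include <cassert>
#include <cstdint>

// As in the patch: the first byte is the low-order one (little-endian).
uint16_t read_patch(const uint8_t *m)  { return m[0] | (m[1] << 8); }

// As suggested in the review: the first byte is the high-order one.
uint16_t read_review(const uint8_t *m) { return (m[0] << 8) | m[1]; }
```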


> 
> This code is copied from the header file.
> 
>>> +        m_field_metadata[i]= x;
>>> +        break;
>>> +      }
>>> +      case MYSQL_TYPE_VARCHAR:
>>> +      {
>>
>>> +        /*
>>> +          These types store two bytes.
>>> +        */
>>> +        char *ptr= (char *)&field_metadata[index];
>>> +        m_field_metadata[i]= uint2korr(ptr);
>>> +        index= index + 2;
>>
>> I was wondering if this might cause an architecture issue.

I was just tired. Forget about it.

> 
> Please elaborate.
> 
> This code was just copied from the header file.
> 
> [snip]
> 
>>> +
>>> +table_def::~table_def()
>>> +{
>>> +  my_free(m_memory, MYF(0));
>>> +#ifndef DBUG_OFF
>>> +  m_type= 0;
>>> +  m_size= 0;
>>> +#endif
>> What are you trying to prevent with this?
> 
> Discovering memory released twice in debug builds.


Sorry, I still do not see how this helps to detect memory being released
twice.

> 
> This code is copied from the header file.
> 
> [snip]
> 
>>> @@ -1261,7 +1294,6 @@ bool sys_var_thd_binlog_format::check(TH
>>>    return result;
>>>  }
>>>  
>>> -
>>>  bool sys_var_thd_binlog_format::is_readonly() const
>>>  {
>>>    /*
>>>
>>
>> Please, remove this hunk.
> 
> OK.
> 