List:Internals« Previous MessageNext Message »
From:Andrei Elkin Date:May 11 2011 10:25am
Subject:Re: WL#2540, binlog checksums, interoperability between different versions
View as plain text  
Kristian, Mats, hello.

I can respond now after discussing the version issue with Mats on irc.

I vaguely recall now, that the version was adapted also because of somewhat
better protection from a use case of the flag OFF due to corruption.
The corrupted FD was planned as a separate task.

Yes, the version check is not much different from a flag that Kristian is
So the checksum-aware master might just turn ON the new common header
checksum-flag in FD event additionally to written algorithm descriptor and
the checksum value.
The slave would check the flag value instead of 
`if ((ver_calc= get_version_product()) >= checksum_version_product)'.
That must be easy change.

However, I suggest to stick to the original design of keeping a binlog file
homogeneous in sense all events (except FD that is always CRC32:d) 
are checksummed with the same algorithm as specfied in FD (the NONE alg incl).
Wrt the flag per event, that could allow a corrupted-also-in-the-flag-part
event be executed, with all harmful implications...

knielsen@stripped (Kristian Nielsen) writes:

> Mats Kindahl <mats.kindahl@stripped> writes:
>> Hmm... not sure what code you're looking at, but I'm looking at
>> line 1696 (or so) and AFAICT, there is no version
>> checking there. There are some version checking to give errors for
>> some specific bugs later in the file (line 5393 or so), but that does
>> not have anything to do with the checksum implementation.
> I am looking at latest lp:mysql-server (as of today),
> Format_description_log_event() constructor. I checked that the code is the
> same in the MySQL 5.6.2-m5 source tarball on (This is
> just MySQL code, no MariaDB):
>   calc_server_version_split();
>   if ((ver_calc= get_version_product()) >= checksum_version_product)
>   {
>     /* the last bytes are the checksum alg desc and value (or value's room) */
>     number_of_event_types -= BINLOG_CHECKSUM_ALG_DESC_LEN;
>     /*
>       FD from the checksum-home version server (ver_calc ==
>       checksum_version_product) must have 
>       number_of_event_types == LOG_EVENT_TYPES.
>     */
>     DBUG_ASSERT(ver_calc != checksum_version_product ||
>                 number_of_event_types == LOG_EVENT_TYPES);
>     checksum_alg= post_header_len[number_of_event_types];
>   }
>   else
>   {
>     checksum_alg= (uint8) BINLOG_CHECKSUM_ALG_UNDEF;
>   }
> This code will set checksum_alg to BINLOG_CHECKSUM_ALG_UNDEF if the server
> version in the received format description event is < 5.6.1.
> And then later in read_log_event():
>     if (alg != BINLOG_CHECKSUM_ALG_UNDEF &&
>         (event_type == FORMAT_DESCRIPTION_EVENT ||
>          alg != BINLOG_CHECKSUM_ALG_OFF))
>       event_len= event_len - BINLOG_CHECKSUM_LEN;
> The result is that if the master sends a format_description_event with server
> version string < 5.6.1 followed by an event with checksum at the end, the
> slave will incorrectly interpret the 4 bytes of checksum as part of the event
> payload (the above subtraction from event_len will not be run).
> Do you see the problem? The slave requests to receive checksums, the master
> sends them back, but then the slave looks at the format_description event, and
> decides that events are without checksums. The result is that eg. checksums
> are interpreted as extra garbage bytes at end of query strings etc.
>> in general, tying any specific property of the server to a version
>> number is likely to cause problems. One you have already discovered
>> (you would need to change the Oracle MySQL server), but there are some
>> other issues as well. In general, it goes something like this:
>>     * The server has some set of properties taken from P = { A, B, C,
>>       ... }
>>     * The server has a version
>> Now, if you use the version number V to figure out the set of
>> properties {A, B, C}, you do this in an indirect manner based on the
>> assumption that the mapping 'properties :: V --> 2^P' is static, but
>> in general it cannot be. Consider these cases:
>>     * Some property is in a plugin and loaded dynamically. In that
>>       case, you cannot look at the server version to decide if it has
>>       property A.
>>     * Server version numbering scheme changes (Just an example is
>>       where the server is forked. Even inside Oracle, there are  two
>>       lines of development of the server with NDB on one branch and
>>       the main server on another.) In this case, you need to do
>>       something like what you are outlining above to be able to deduce
>>       what property a server supports based on the version number.
>>     * Server version numbering is not monotone, that is, some
>>       properties are removed from later servers. This would require
>>       you to hard-code an elaborate scheme for deciding if a server
>>       has a certain property based on the version of the server.
>> Having a direct scheme, asking the question "do you have property A"
>> is superior to asking "what version is you" and then deducing using
>> private notes that the server should have property A.
> Yes, I fully agree :-) And good point about NDB, I would expect there will be
> problems replicating with checksums enabled just between mysql and telco when
> wl#2540 is merged into -telco.

Well, telco could modify `checksum_version_product' and stay with the
version check. Are not they replicate exclusively inside telco versions?

> Do you see now from above explanation where the WL#2540 in current MySQL
> lp:mysql-server source is using version to detect capability?
> In this case the problem is slightly complicated, as we do not only need the
> capabilities of the server sending the event, we need the capabilities of the
> server that wrote the event. Eg. if MySQL 5.6.2 is sending old events written
> by MySQL 5.5.8 before upgrading. So it is not enough for slave to query
> capabilities of the running server, it needs also to check capabilities of the
> server that wrote the events.
> As a suggestion on how to improve this, how about adding a new event flag?
>     #define LOG_EVENT_CHECKSUMMED_F 0x200
> This flag will be set on events that have checksums. Then there is no need to
> look at previous format_description_event to guess if current event has
> checksum embedded or not, just look at the flag. And a lot of scary logic
> related to relay logs where some events are checksummed and some are not can
> be eliminated.
> What do you think?
>  - Kristian.


WL#2540, binlog checksums, interoperability between different versionsKristian Nielsen9 May
  • Re: WL#2540, binlog checksums, interoperability between differentversionsMats Kindahl9 May
    • Re: WL#2540, binlog checksums, interoperability between different versionsKristian Nielsen9 May
      • Re: WL#2540, binlog checksums, interoperability between differentversionsMats Kindahl13 May
  • Re: WL#2540, binlog checksums, interoperability between different versionsAndrei Elkin11 May