On 05/09/2011 09:19 AM, Kristian Nielsen wrote:
> In MariaDB, we backported the WL#2540 (binlog checksums) from MySQL 5.6 into
> MariaDB 5.3. Due to the way WL#2540 is designed, this requires some changes in
> MySQL 5.6 to fully interoperate with MariaDB. So I wanted to make MySQL@Oracle
> developers aware of this issue.
>
> This WL#2540 feature is a bit special, in that the slave needs to parse the
> server version string in the format description event to correctly interpret
> the following event data.
Hmmm... no, not really.
> So the problem occurs when a MySQL 5.6 slave replicates against a MariaDB 5.3
> (or any version < 5.6.1, which is the point at which the feature is considered
> enabled in MySQL 5.6 code). Currently, the MySQL 5.6 slave will notify the
> master that it understands checksums, but the slave does not realise that the
> master also understands checksums. So the MySQL 5.6 slave will wrongly
> interpret the last four checksum bytes as part of the payload of events. Which
> is not good.
Well... the handshake used in 5.6 goes like this:
  * Slave sends the statement "SET @master_binlog_checksum=
    @@global.binlog_checksum" to the master.
      o If this gives an error (binlog_checksum not found), there
        is no checksum support in the master.
      o If this succeeds, master has checksum support.
  * Slave sends "SELECT @master_binlog_checksum" to read the
    checksum used by the master.
  * The master_binlog_checksum variable is then read by the dump
    thread on the master to decide if a checksum should be generated.
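
The handshake above can be sketched roughly as follows. This is only a toy
model in Python, not the server code: ToyMaster, UnknownSystemVariable and
negotiate_checksum are hypothetical names made up for illustration, and the
"connection" is just a pair of dictionaries.

```python
class UnknownSystemVariable(Exception):
    """Raised by the toy master when a statement names a variable it lacks."""


class ToyMaster:
    """Hypothetical stand-in for a master connection; not a real client API."""

    def __init__(self, global_vars):
        self.global_vars = dict(global_vars)  # e.g. {"binlog_checksum": "CRC32"}
        self.user_vars = {}                   # per-connection @variables

    def set_user_var_from_global(self, user_var, global_var):
        # Models: SET @user_var = @@global.global_var
        if global_var not in self.global_vars:
            raise UnknownSystemVariable(global_var)
        self.user_vars[user_var] = self.global_vars[global_var]

    def select_user_var(self, user_var):
        # Models: SELECT @user_var
        return self.user_vars.get(user_var)


def negotiate_checksum(master):
    """Return the master's checksum setting, or None if it has no support."""
    try:
        master.set_user_var_from_global("master_binlog_checksum",
                                        "binlog_checksum")
    except UnknownSystemVariable:
        # The master does not know binlog_checksum: no checksum support.
        return None
    return master.select_user_var("master_binlog_checksum")
```

Note that nothing here looks at a version string: a master that knows the
variable answers with its setting, and one that does not produces an error.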
> In MariaDB, I handle this issue by having two version points, one for MariaDB
> (where checksums are supported for >=5.3.0), and one for MySQL (>= 5.6.1). So
> MariaDB slave on MySQL master works, but the other direction requires changes
> in MySQL that I cannot do, of course. See lp:maria/5.3, file sql/log_event.cc,
> function Format_description_log_event::is_version_before_checksum() and
> related code for how this should be done, if interested.
Hmm... not sure what code you're looking at, but I'm looking at
rpl_slave.cc line 1696 (or so) and AFAICT, there is no version
checking there. There is some version checking to give errors for
some specific bugs later in the file (line 5393 or so), but that does
not have anything to do with the checksum implementation.

In general, tying any specific property of the server to a version
number is likely to cause problems. One you have already discovered
(you would need to change the Oracle MySQL server), but there are some
other issues as well. In general, it goes something like this:
  * The server has some set of properties taken from P = { A, B, C,
    ... }
  * The server has a version V
Now, if you use the version number V to figure out the set of
properties {A, B, C}, you do this in an indirect manner based on the
assumption that the mapping 'properties :: V --> 2^P' is static, but
in general it cannot be. Consider these cases:
  * Some property is in a plugin and loaded dynamically. In that
    case, you cannot look at the server version to decide if it has
    property A.
  * The server version numbering scheme changes, for example when
    the server is forked. (Even inside Oracle, there are two
    lines of development of the server, with NDB on one branch and
    the main server on another.) In this case, you need to do
    something like what you are outlining above to be able to deduce
    which properties a server supports based on the version number.
  * Server version numbering is not monotone, that is, some
    properties are removed from later servers. This would require
    you to hard-code an elaborate scheme for deciding if a server
    has a certain property based on the version of the server.
Having a direct scheme, asking the question "do you have property A",
is superior to asking "what version are you" and then deducing from
private notes that the server should have property A.
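
To make the difference concrete, here is a small Python sketch. The two
function names and the dictionaries describing the servers are hypothetical;
the version thresholds are the ones quoted in this thread (checksums in
MySQL from 5.6.1, in MariaDB from 5.3.0).

```python
def has_checksums_by_version(version):
    # Indirect: deduce the property from "private notes" about versions.
    # The threshold is the MySQL one quoted in this thread.
    return version >= (5, 6, 1)


def has_checksums_by_probe(server_vars):
    # Direct: ask whether the server knows the binlog_checksum variable.
    return "binlog_checksum" in server_vars


# Illustrative descriptions of two masters (made-up data structures).
mysql_561 = {"version": (5, 6, 1), "vars": {"binlog_checksum": "CRC32"}}
mariadb_530 = {"version": (5, 3, 0), "vars": {"binlog_checksum": "CRC32"}}

for server in (mysql_561, mariadb_530):
    print(server["version"],
          has_checksums_by_version(server["version"]),
          has_checksums_by_probe(server["vars"]))
```

The version test gets the forked server wrong, because its numbering does
not follow the table the slave has hard-coded; the probe gives the same
answer for both.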
> There is some more documentation of the interoperability issues in MWL#180:
>
> http://askmonty.org/worklog/Server-Sprint/?tid=180
>
> But the only issue that requires changes in MySQL (that I know of) is this
> MariaDB master -> MySQL slave.
Not sure what other issues you are considering: the only
interoperability issue you discuss is about replicating between
servers with or without checksums.
> Now, it could be argued that it is rather inconvenient that a slave needs to
> understand which versions of every master out there implements WL#2540, and I
> would agree :-). So if there is interest we could discuss ways to change the
> WL#2540 design to detect presense or absense of checksums in a better way, I
> will be happy to participate in such discussion. But for now I just wanted to
> make MySQL@Oracle developers aware of the issue so they have a chance to make
> the case MariaDB master -> MySQL slave work, if they want.
I think it will work fine if the MariaDB server just implements the
scheme described above. Reading the code, I see that there is one case
that is not correctly handled: if the master uses a checksumming
scheme that the slave does not know about, the slave might behave
badly.
I reported it as BUG#61096 (http://bugs.mysql.com/bug.php?id=61096).
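
One way a slave could guard against that case, sketched in Python. This is
an assumption about what a check might look like, not a description of the
actual fix; KNOWN_ALGORITHMS, UnsupportedChecksumAlgorithm and
require_known_algorithm are hypothetical names.

```python
# Algorithms this (hypothetical) slave knows how to verify.
KNOWN_ALGORITHMS = {"NONE", "CRC32"}


class UnsupportedChecksumAlgorithm(Exception):
    """The master reported a checksum algorithm the slave cannot handle."""


def require_known_algorithm(name):
    # Stop with a clear error instead of misreading the event data.
    if name not in KNOWN_ALGORITHMS:
        raise UnsupportedChecksumAlgorithm(name)
    return name
```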
Just my few cents,
Mats Kindahl