From: Mats Kindahl
Date: May 9 2011 9:41am
Subject: Re: WL#2540, binlog checksums, interoperability between different versions
List-Archive: http://lists.mysql.com/internals/38319
Message-Id: <4DC7B6C2.3030805@oracle.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

On 05/09/2011 09:19 AM, Kristian Nielsen wrote:
> In MariaDB, we backported the WL#2540 (binlog checksums) from MySQL 5.6 into
> MariaDB 5.3. Due to the way WL#2540 is designed, this requires some changes in
> MySQL 5.6 to fully interoperate with MariaDB. So I wanted to make MySQL@Oracle
> developers aware of this issue.
>
> This WL#2540 feature is a bit special, in that the slave needs to parse the
> server version string in the format description event to correctly interpret
> the following event data.

Hmmm... no, not really.

> So the problem occurs when a MySQL 5.6 slave replicates against a MariaDB 5.3
> (or any version < 5.6.1, which is the point at which the feature is considered
> enabled in MySQL 5.6 code). Currently, the MySQL 5.6 slave will notify the
> master that it understands checksums, but the slave does not realise that the
> master also understands checksums. So the MySQL 5.6 slave will wrongly
> interpret the last four checksum bytes as part of the payload of events. Which
> is not good.

Well... the handshake used in 5.6 goes like this:

  * Slave sends the statement "SET @master_binlog_checksum=
    @@global.binlog_checksum" to the master.
      o If this gives an error (binlog_checksum not found), there is no
        checksum support in the master.
      o If this succeeds, master has checksum support.
  * Slave sends "SELECT @master_binlog_checksum" to read the checksum
    used by the master.
  * The master_binlog_checksum variable is then read by the dump thread
    on the master to decide if a checksum should be generated.
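
The slave side of the handshake above can be sketched roughly as follows.
This is only an illustration in Python, not the server code: run_query,
QueryError and make_fake_master are hypothetical names standing in for
the slave's connection to the master, and the third step (the dump thread
reading the variable on the master) is not modelled.

```python
class QueryError(Exception):
    """Raised by the (hypothetical) connection when the master rejects a statement."""


def negotiate_checksum(run_query):
    """Return the master's checksum algorithm, or None if it has no checksum support.

    run_query is a hypothetical callable: it takes an SQL string, returns a
    list of row tuples, and raises QueryError if the master reports an error.
    """
    try:
        # Step 1: copy the master's global setting into a session variable.
        run_query("SET @master_binlog_checksum= @@global.binlog_checksum")
    except QueryError:
        # The master does not know binlog_checksum: no checksum support.
        return None
    # Step 2: read back which checksum the master will use.
    rows = run_query("SELECT @master_binlog_checksum")
    return rows[0][0]


def make_fake_master(algorithm):
    """Build a stand-in master for illustration; algorithm=None means no support."""
    def run_query(sql):
        if algorithm is None and "binlog_checksum" in sql:
            raise QueryError("Unknown system variable 'binlog_checksum'")
        if sql.startswith("SELECT"):
            return [(algorithm,)]
        return []
    return run_query
```

Note that nothing in this sketch looks at a version string: the slave
learns about checksum support only from whether the SET statement
succeeds.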

> In MariaDB, I handle this issue by having two version points, one for MariaDB
> (where checksums are supported for >=5.3.0), and one for MySQL (>= 5.6.1). So
> MariaDB slave on MySQL master works, but the other direction requires changes
> in MySQL that I cannot do, of course. See lp:maria/5.3, file sql/log_event.cc,
> function Format_description_log_event::is_version_before_checksum() and
> related code for how this should be done, if interested.

Hmm... not sure what code you're looking at, but I'm looking at
rpl_slave.cc line 1696 (or so) and AFAICT, there is no version checking
there. There is some version checking to give errors for some specific
bugs later in the file (line 5393 or so), but that does not have
anything to do with the checksum implementation.

In general, tying any specific property of the server to a version
number is likely to cause problems. One you have already discovered
(you would need to change the Oracle MySQL server), but there are some
other issues as well. In general, it goes something like this:

  * The server has some set of properties taken from P = { A, B, C, ... }
  * The server has a version V

Now, if you use the version number V to figure out the set of
properties {A, B, C}, you do this in an indirect manner based on the
assumption that the mapping 'properties :: V --> 2^P' is static, but in
general it cannot be. Consider these cases:

  * Some property is in a plugin and loaded dynamically. In that case,
    you cannot look at the server version to decide if it has property A.
  * Server version numbering scheme changes. (One example is when the
    server is forked. Even inside Oracle, there are two lines of
    development of the server, with NDB on one branch and the main
    server on another.) In this case, you need to do something like
    what you are outlining above to be able to deduce what property a
    server supports based on the version number.
  * Server version numbering is not monotone, that is, some properties
    are removed from later servers. This would require you to hard-code
    an elaborate scheme for deciding if a server has a certain property
    based on the version of the server.

Having a direct scheme, asking the question "do you have property A",
is superior to asking "what version are you" and then deducing from
private notes that the server should have property A.

> There is some more documentation of the interoperability issues in MWL#180:
>
> http://askmonty.org/worklog/Server-Sprint/?tid=180
>
> But the only issue that requires changes in MySQL (that I know of) is this
> MariaDB master -> MySQL slave.

Not sure what other issues you are considering: the only
interoperability issue you discuss is about replicating between servers
with or without checksums.

> Now, it could be argued that it is rather inconvenient that a slave needs to
> understand which versions of every master out there implements WL#2540, and I
> would agree :-). So if there is interest we could discuss ways to change the
> WL#2540 design to detect presense or absense of checksums in a better way, I
> will be happy to participate in such discussion. But for now I just wanted to
> make MySQL@Oracle developers aware of the issue so they have a chance to make
> the case MariaDB master -> MySQL slave work, if they want.

I think it will work fine if the MariaDB server just implements the
scheme described above.

Reading the code, I see that there is one case that is not correctly
handled: if the master uses a checksumming scheme that the slave does
not know about, the slave might behave badly. I reported it as
BUG#61096 (http://bugs.mysql.com/bug.php?id=61096).

Just my few cents,
Mats Kindahl